The number of tokens in a string or vector of strings
Usage
ntokens(
x,
model = getOption("pangoling.causal.default"),
add_special_tokens = NULL,
config_tokenizer = NULL
)
Arguments
- x
A character vector: a single string or several strings whose tokens will be counted.
- model
Name of a pre-trained model, or path to a folder that contains one.
- add_special_tokens
Whether to include special tokens. When left as NULL, the default of the AutoTokenizer method in Python applies.
- config_tokenizer
List with other arguments that control how the tokenizer from Hugging Face is accessed.
See also
Other token-related functions: tokenize_lst(), transformer_vocab()
Examples
if (interactive()) {
ntokens(x = c("The apple doesn't fall far from the tree."), model = "gpt2")
}
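Since x may also be a vector of strings, the following sketch shows a call with several inputs. It assumes that pangoling and its Python dependencies are installed and that the "gpt2" model can be downloaded. The example sentences and the variable names are illustrative only. The comparison with tokenize_lst() rests on the assumption that it returns one vector of tokens per input string, and whether add_special_tokens = TRUE changes the counts depends on the tokenizer of the chosen model.

```r
# Illustrative inputs (not from the package documentation).
sentences <- c(
  "The apple doesn't fall far from the tree.",
  "Don't judge a book by its cover."
)

if (interactive()) {
  # One token count per element of `sentences`.
  n <- pangoling::ntokens(x = sentences, model = "gpt2")
  print(n)

  # Assumed cross-check: the lengths of the token vectors returned by
  # tokenize_lst() should correspond to the counts above.
  tokens <- pangoling::tokenize_lst(x = sentences, model = "gpt2")
  print(lengths(tokens))

  # Explicitly request special tokens; the effect on the counts
  # depends on the tokenizer of the selected model.
  n_special <- pangoling::ntokens(
    x = sentences,
    model = "gpt2",
    add_special_tokens = TRUE
  )
  print(n_special)
}
```

The calls are wrapped in interactive() in the same way as the example above, so nothing is downloaded when the code is sourced non-interactively.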