NLICollator

Reference API for NLICollator

class liqfit.collators.NLICollator(tokenizer: AutoTokenizer, max_length: int, padding: Union[bool, str], truncation: bool)

Parameters:

  • tokenizer (AutoTokenizer, Callable): The tokenizer used to convert the input texts into input IDs.

  • max_length (int): Maximum sequence length, in tokens, used when tokenizing the input sequences.

  • padding (Union[bool, str]): Whether, and with which strategy, to pad sequences during tokenization.

  • truncation (bool): Whether to truncate sequences during tokenization.
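To make the three tokenization options concrete, here is a minimal sketch of what they do to a batch. It is not liqfit's implementation: `ToyCollator` and `pad_id` are hypothetical names, the input is already-tokenized ID lists rather than raw text, and the padding values (`True` / `"longest"`, `"max_length"`, `False`) are assumed to follow the Hugging Face tokenizer convention.

```python
from typing import Dict, List, Union


class ToyCollator:
    """Hypothetical stand-in that mimics max_length, padding, and truncation."""

    def __init__(self, max_length: int, padding: Union[bool, str],
                 truncation: bool, pad_id: int = 0) -> None:
        self.max_length = max_length
        self.padding = padding
        self.truncation = truncation
        self.pad_id = pad_id  # assumed padding token ID, for illustration only

    def __call__(self, batch: List[List[int]]) -> Dict[str, List[List[int]]]:
        # Truncation: cut every sequence down to at most max_length tokens.
        if self.truncation:
            batch = [ids[: self.max_length] for ids in batch]

        # Padding target: fixed max_length, longest sequence in the batch, or none.
        if self.padding == "max_length":
            target = self.max_length
        elif self.padding in (True, "longest"):
            target = max(len(ids) for ids in batch)
        else:
            target = None

        input_ids, attention_mask = [], []
        for ids in batch:
            n_pad = max(target - len(ids), 0) if target is not None else 0
            input_ids.append(ids + [self.pad_id] * n_pad)
            # 1 marks a real token, 0 marks padding.
            attention_mask.append([1] * len(ids) + [0] * n_pad)
        return {"input_ids": input_ids, "attention_mask": attention_mask}


collator = ToyCollator(max_length=4, padding="max_length", truncation=True)
out = collator([[5, 6, 7, 8, 9, 10], [11, 12]])
print(out["input_ids"])       # [[5, 6, 7, 8], [11, 12, 0, 0]]
print(out["attention_mask"])  # [[1, 1, 1, 1], [1, 1, 0, 0]]
```

Like any collate function, the object is called with a list of samples and returns one batch; the real NLICollator is expected to do the tokenization itself using the tokenizer it was given.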

Using NLICollator

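```python
from liqfit.collators import NLICollator
from liqfit.datasets import NLIDataset
from torch.utils.data import DataLoader

dataset = NLIDataset(...)    # pass your dataset arguments
collator = NLICollator(...)  # pass tokenizer, max_length, padding, truncation

# Option 1: use the collator with a PyTorch DataLoader
dataloader = DataLoader(dataset, collate_fn=collator)

# Option 2: use the collator with the Hugging Face Trainer
from transformers import Trainer

# Trainer also needs a model and training arguments, omitted here
trainer = Trainer(train_dataset=dataset, data_collator=collator)
```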
