gatelfpytorchjson.modelwrapperdefault module

class gatelfpytorchjson.modelwrapperdefault.ModelWrapperDefault(dataset, config={}, cuda=None)[source]

Bases: gatelfpytorchjson.modelwrapper.ModelWrapper

apply(instancelist, converted=False, reshaped=False)[source]

Given a list of instances in original format (or in converted format if converted=True), applies the model to them in evaluation mode and returns three values. The first return value is the batch of predictions: a list of values (one per instance in the batch) for classification, or a list of lists (one list representing the sequence for each instance in the batch) for sequence tagging. The second return value contains the scores for the returned predictions; it has the same shape as the first return value, but with a score in place of each label. The third return value is a batch of confidence/scoring values: for classification this is a list of lists, where each inner list is the label distribution; for sequence tagging it is a list of lists of lists, again with the label distribution as the innermost list. Note that the mapping between an index in a label distribution and the corresponding label can be figured out by the caller by retrieving the target vocab first. This may return additional data in the future, or the format of what is returned may change.
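
For example, a classification call might look like the following sketch; the wrapper construction, instance contents, and label names are hypothetical and depend on the dataset:

    wrapper = ModelWrapperDefault(dataset)
    predictions, scores, distributions = wrapper.apply(instancelist)
    # predictions:   one label per instance, e.g. ["pos", "neg"]
    # scores:        one score per instance, e.g. [0.91, 0.77]
    # distributions: one label distribution per instance,
    #                e.g. [[0.91, 0.09], [0.23, 0.77]]
    # To map a distribution index back to a label, retrieve the
    # target vocab from the dataset first.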

checkpoint(filenameprefix, checkpointnr=None)[source]

Save the module, adding a checkpoint number to the name.
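
Usage is a one-liner; the exact naming scheme of the saved file is not documented here:

    # Save the module under filenameprefix "mymodel" with checkpoint number 3.
    wrapper.checkpoint("mymodel", checkpointnr=3)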

evaluate(validationinstances, train_mode=False, as_pytorch=True)[source]

Apply the model to the independent part of the validation instances and use the dependent part to evaluate the predictions. The validationinstances must be in batch format. Returns a tuple of loss and accuracy. By default the loss is returned as a PyTorch variable and the accuracy as a PyTorch tensor; if as_pytorch is set to False, floats are returned instead.
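
A minimal sketch, assuming validationinstances is already in the dataset's batch format:

    # PyTorch values by default; plain floats with as_pytorch=False.
    loss, accuracy = wrapper.evaluate(validationinstances, as_pytorch=False)
    print("validation loss=%s accuracy=%s" % (loss, accuracy))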

get_module()[source]

Return the PyTorch module that has been built and is used by this wrapper.

init_after_load(filenameprefix)[source]
init_classification(dataset)[source]
init_from_dataset()[source]

Set the convenience attributes which we get from the dataset instance.

init_sequencetagging(dataset)[source]

Build the module for sequence tagging.

prepare_data(validationsize=None, file=None)[source]

If file is not None, use the content of the file and ignore the size. If validationsize is greater than 1, it is the absolute number of validation instances to use; if less than 1, it is the portion to use, e.g. 0.01 for one percent.
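
For instance (the sizes below are illustrative):

    # Hold out 1000 instances for validation (absolute size, since > 1).
    wrapper.prepare_data(validationsize=1000)

    # Hold out 1% of the data for validation (portion, since < 1).
    wrapper.prepare_data(validationsize=0.01)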

save(filenameprefix)[source]
save_model(filenameprefix)[source]
set_cuda(flag)[source]

Advise the wrapper to use CUDA if flag is True, or the CPU if False. A True flag is ignored if CUDA is not available.

train(max_epochs=20, batch_size=20, early_stopping=True, filenameprefix=None)[source]

Train the model on the dataset. max_epochs is the maximum number of epochs to train; if early_stopping is enabled, training may stop after fewer epochs. If early_stopping is True, a default strategy is used where training stops after the validation accuracy has not improved for 2 epochs. If early_stopping is set to a function, that function (which must accept a standard set of parameters and return a boolean) is used instead. TODO: check if the config should be used by default for batch_size etc. here!
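
A minimal sketch of both variants; the keyword-argument signature of the custom stopping function is an assumption, since only "a standard set of parameters" is documented:

    # Default early stopping: stop once validation accuracy has not
    # improved for 2 epochs.
    wrapper.train(max_epochs=20, batch_size=20, early_stopping=True,
                  filenameprefix="mymodel")

    # Hypothetical custom strategy: stop as soon as validation accuracy
    # exceeds 0.95 (the **kwargs parameter names are assumptions).
    def stop_when_good_enough(**kwargs):
        accuracies = kwargs.get("validation_accuracies", [])
        return bool(accuracies) and accuracies[-1] > 0.95

    wrapper.train(max_epochs=50, batch_size=20, filenameprefix="mymodel",
                  early_stopping=stop_when_good_enough)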

gatelfpytorchjson.modelwrapperdefault.f(value)[source]

Format a float value to have 3 digits after the decimal point.
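
A plausible one-line equivalent, assuming the function returns a string; the actual implementation may differ:

    def f(value):
        # Format a float to 3 digits after the decimal point.
        return "%.3f" % value

    f(0.123456)   # -> '0.123'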