How Patsnap used GPT-2 inference on Amazon SageMaker with low latency and cost

AWS Machine Learning Blog

This blog post was co-authored, and includes an introduction, by Zilong Bai, senior natural language processing engineer at Patsnap.

    … add(
        "input_ids",
        min=(1, 1),
        opt=(batch_size, max_sequence_length // 2),
        max=(batch_size, max_sequence_length),
    )]
    gpt2_engine = GPT2ONNXFile(onnx_path, metadata).as_trt_engine(output_fpath=trt_path, …
