How Patsnap used GPT-2 inference on Amazon SageMaker with low latency and cost
AWS Machine Learning Blog
JULY 24, 2023
This blog post was co-authored, and includes an introduction, by Zilong Bai, senior natural language processing engineer at Patsnap.

    add(
        "input_ids",
        min=(1, 1),
        opt=(batch_size, max_sequence_length // 2),
        max=(batch_size, max_sequence_length),
    )]
    gpt2_engine = GPT2ONNXFile(onnx_path, metadata).as_trt_engine(output_fpath=trt_path,
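The fragment above declares a minimum, optimal, and maximum shape for `input_ids`, which is how a dynamic-shape engine is told what range of batch sizes and sequence lengths to support. As a rough illustration of what such a range means, here is a minimal pure-Python sketch. `ShapeProfile` and `accepts` are hypothetical names invented for this example and are not part of TensorRT or of the post's code, and the values chosen for `batch_size` and `max_sequence_length` are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple

Shape = Tuple[int, int]  # (batch, sequence_length)


@dataclass(frozen=True)
class ShapeProfile:
    """Hypothetical stand-in for a min/opt/max shape range (not a TensorRT class)."""
    min: Shape
    opt: Shape
    max: Shape

    def accepts(self, shape: Shape) -> bool:
        # A shape is in range when every dimension lies between min and max, inclusive.
        return len(shape) == len(self.min) and all(
            lo <= dim <= hi for lo, dim, hi in zip(self.min, shape, self.max)
        )


# Assumed values for illustration only; the source does not state them.
batch_size = 4
max_sequence_length = 64

# Same min/opt/max expressions as in the fragment above.
profile = ShapeProfile(
    min=(1, 1),
    opt=(batch_size, max_sequence_length // 2),
    max=(batch_size, max_sequence_length),
)

print(profile.accepts((2, 40)))  # inside the declared range
print(profile.accepts((8, 40)))  # batch dimension exceeds the declared maximum
```

Under these assumptions, an input whose batch size or sequence length falls outside the declared range would not be served by an engine built for that range. The `opt` shape marks the case the engine is presumably tuned for.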