On-device large model optimization

Leveraging hybrid quantization and forward prediction, this technology shrinks large models while accelerating on-device inference.

How it works

Hybrid quantization: Customized quantization schemes are designed for the characteristics of different parts of the model and combined into a hybrid solution that minimizes model size while preserving accuracy.
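The idea above can be sketched in a few lines. This is a minimal illustration, not Lenovo's actual implementation: accuracy-sensitive layers (here, an assumed attention projection) keep a wider int8 format, while the rest are compressed to int4, using simple symmetric per-tensor quantization.

```python
import numpy as np

def _quantize(w, bits):
    """Symmetric per-tensor quantization: w ~= scale * q."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for int8, 7 for int4
    scale = max(float(np.abs(w).max()), 1e-12) / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def hybrid_quantize(layers, sensitive):
    """Keep accuracy-sensitive layers at int8; compress the rest to int4."""
    packed = {}
    for name, w in layers.items():
        bits = 8 if name in sensitive else 4
        q, scale = _quantize(w, bits)
        packed[name] = (bits, q, scale)
    return packed

# Toy weights; layer names are illustrative, not from any real model.
rng = np.random.default_rng(0)
layers = {"attn.qkv": rng.normal(size=(4, 4)), "mlp.fc": rng.normal(size=(4, 4))}
packed = hybrid_quantize(layers, sensitive={"attn.qkv"})
```

Dequantizing `q * scale` recovers the weights to within half a quantization step, and the int4 tensors occupy a quarter of the original fp32 storage when bit-packed.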

Forward prediction: Parallel prediction schemes tailored to specific LLM characteristics enable the system to predict multiple future tokens at once, significantly improving inference speed and efficiency.
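A common way to realize this kind of multi-token prediction is draft-and-verify decoding: a cheap draft model proposes several future tokens, and the large target model checks them, accepting the longest matching prefix. The sketch below uses toy deterministic "models" (plain functions) purely to show the control flow; it is an assumption about the scheme, not the product's actual algorithm.

```python
def speculative_decode(target_next, draft_next, prompt, k=4, max_new=12):
    """Draft k tokens with a small model, then verify them against the
    target model; accept the longest agreeing prefix per round."""
    seq = list(prompt)
    while len(seq) - len(prompt) < max_new:
        # 1) the draft model proposes k tokens cheaply
        draft, ctx = [], list(seq)
        for _ in range(k):
            t = draft_next(ctx)
            draft.append(t)
            ctx.append(t)
        # 2) the target model verifies them (in practice, one batched pass)
        accepted, ctx = 0, list(seq)
        for t in draft:
            if target_next(ctx) != t:
                break
            accepted += 1
            ctx.append(t)
        seq = ctx
        if accepted < len(draft):
            # target disagreed: emit the target's own token and continue
            seq.append(target_next(seq))
    return seq[len(prompt):]

# Toy models over digit tokens: the target always emits (last + 1) mod 10.
target = lambda ctx: (ctx[-1] + 1) % 10
out = speculative_decode(target, target, [0], k=4, max_new=12)
```

When the draft model agrees with the target, each verification round commits k tokens at once instead of one, which is where the speed-up comes from; when it disagrees, output quality is unchanged because only target-approved tokens are kept.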

User perception

+ Shorter response time
+ Smaller memory footprint
+ Faster generation
+ Reduced power consumption

*Data sourced from Lenovo Labs. Feature performance is for reference only and actual experience may vary.

Data security

Technologies such as database encryption, sensitive word filtering, and large model encryption are used to effectively safeguard user privacy and data security.

How it works

+ Database encryption technology:
Advanced encryption prevents unauthorized third parties from accessing data while it is stored on the device.
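As a conceptual sketch of encryption at rest, the example below derives a key from a passphrase and encrypts each record with an authenticated stream cipher built from Python's standard library. This is illustrative only: the construction, key sizes, and record layout are assumptions, and a real deployment would use a vetted AES-based library rather than this hand-rolled scheme.

```python
import hashlib, hmac, os

def derive_key(passphrase: bytes, salt: bytes) -> bytes:
    # PBKDF2 stretches a passphrase into a 32-byte encryption key
    return hashlib.pbkdf2_hmac("sha256", passphrase, salt, 100_000)

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Counter-mode keystream using HMAC-SHA256 as the pseudorandom function
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt_record(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)
    ks = _keystream(key, nonce, len(plaintext))
    ct = bytes(a ^ b for a, b in zip(plaintext, ks))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()  # integrity tag
    return nonce + tag + ct            # layout: 16B nonce | 32B tag | ciphertext

def decrypt_record(key: bytes, blob: bytes) -> bytes:
    nonce, tag, ct = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("record tampered with or wrong key")
    ks = _keystream(key, nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, ks))
```

The integrity tag means a modified or truncated record is rejected at decryption time rather than silently returning corrupted plaintext.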

+ Sensitive word filtering technology:
Deep learning-based algorithms extract features from text to automatically detect and block inappropriate content, ensuring compliance and maintaining a safe environment.
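A runnable deep-learning classifier is beyond a short example, but the filtering pipeline can be illustrated with a simplified rule-based stand-in: normalize the text (case, accents, common character substitutions) and match it against a blocklist. The blocklist contents and substitution map here are invented placeholders.

```python
import re
import unicodedata

# Placeholder blocklist; a production system learns features with a deep model.
BLOCKLIST = {"badword", "spamlink"}

# Undo common leetspeak substitutions: 0->o, 1->l, 3->e, 4->a, 5->s, 7->t, @->a, $->s
LEET = str.maketrans("013457@$", "oleastas")

def normalize(text: str) -> str:
    """Lower-case, strip accents, and undo character substitutions."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    return text.lower().translate(LEET)

def contains_sensitive(text: str) -> bool:
    """True if any normalized token appears on the blocklist."""
    tokens = re.findall(r"[a-z]+", normalize(text))
    return any(tok in BLOCKLIST for tok in tokens)
```

Normalization is what lets the filter catch obfuscated variants such as "B4dw0rd"; the deep-learning approach described above generalizes further, catching paraphrases that no fixed list can enumerate.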

+ Large model encryption technology.

User perception

Content is processed entirely on the device, effectively safeguarding users' privacy and data security.
