Schematic overview of the BUSGen pretraining and adaptation framework. a, More than 3.5 million breast ultrasound images were collected from 5,907 examinations of 4,636 patients with 3,749 lesions. These data were annotated by clinical experts and used in a conditional generation task to pretrain the BUSGen model, enabling it to learn a rich data distribution and to generate high-quality images through an iterative refinement process repeated T times. The pretraining task was conditioned on pathology label, lesion box and device type. b, The pretrained BUSGen can be adapted to various downstream tasks, where it generates unlimited, informative data to support the development of downstream models. To preserve the information acquired during pretraining, we froze the pretrained parameters and fine-tuned only low-rank adapters (LoRA). Compared with baseline models, the BUSGen-based downstream models (BUS-DMs) achieved superior performance on a wide range of tasks across breast cancer screening, diagnosis and prognosis.
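
The adaptation step in panel b (frozen pretrained parameters plus trainable low-rank adapters) can be illustrated with a minimal sketch of a generic LoRA layer. This is not the BUSGen implementation: the class name `LoRALinear`, the `alpha` scaling factor, the initialization scheme and the toy dimensions are all assumptions chosen for illustration, and plain Python lists stand in for tensors.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Hypothetical linear layer with a frozen weight and a low-rank update.

    Output is W x + (alpha / rank) * B (A x), where W is frozen and only
    A (rank x d_in) and B (d_out x rank) would be updated during adaptation.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # pretrained parameters, kept frozen
        # Assumed initialization: A is small random, B is zero, so the
        # adapter contributes nothing before fine-tuning starts.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)                # frozen path
        update = matvec(self.B, matvec(self.A, x))   # low-rank adapter path
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_count(self):
        """Number of adapter parameters (A and B only)."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Toy usage: a 2x2 frozen weight with a rank-1 adapter.
frozen = [[1.0, 2.0], [3.0, 4.0]]
layer = LoRALinear(frozen, rank=1)
print(layer.forward([1.0, 1.0]))
print(layer.trainable_count())
```

Because `B` starts at zero in this sketch, the adapted layer initially reproduces the frozen layer's output exactly, and fine-tuning only has to learn a small correction. The adapter adds rank × (d_in + d_out) trainable parameters, which is far fewer than d_in × d_out when the rank is small relative to the layer width.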