Is Microsoft providing lineage information on the source of the training data?
OpenAI trained the model on publicly available, non-trademarked data collected over a specified period of time. Microsoft has adjusted some of the weights to make the model a better fit for enterprise environments.
Microsoft 365 Copilot does not use customer data, including prompts, to train or improve Microsoft’s large language models (LLMs). Microsoft’s position is that the customer’s data belongs to the customer. The guarantees Microsoft has always made for enterprise and commercial data therefore persist and continue to apply in the AI era. You can review Microsoft’s privacy policy and service documentation for more information at http://aka.ms/privacy.