20 June 2023
AI companies train their models on huge amounts of data. Some of this data is copyrighted. There is an ongoing debate on what this means for IP. Some believe that using existing works to train LLMs counts as fair use. If the work an LLM outputs is very different to the works it trained on, how is this different from a human writer?
The other point of view sees a clear copyright violation. The LLM would not be able to output anything without training on the source works.
Another position is there is a case for compensating owners of content. Although there might be no clear copyright violation, LLMs owe a debt to the material they trained on.
The debate will likely rage for some years.
Tutello does not train AI models with your content. Instead, we prepare your content to work well with prompt engineering techniques. Prompting is not training. We send small sections of content (usually a sentence or two) along with student questions to the open ai api. This technique can give amazing results. Open ai commit that they do not use data submitted to this API for training:
Using Tutello you can give your students all the advantages of LLMs without losing control of your IP. Only the students you choose will have access to the content you add to Tutello.
You can track which parts of your content is being used for prompting. This gives you the benefit of homing in on areas of content that might be difficult for students.
You can remove content from our system whenever you choose.
If you would like to find out more about how Tutello can help you make the most of your content, please get in touch.