
Artificial Intelligence and Large Language Models

An overview of ChatGPT and other large language models: concerns, uses, and how to cite them

Concerns about AI

Artificial intelligence can be a helpful tool in the information literacy toolbox for topic brainstorming, writing refinement, and cross-disciplinary perspectives. Generative AI could also give first-generation college students a chance to compete on an equal footing in culturally biased environments, as they are often more resourceful and creative in their approach to academics and campus life.

However, there are known issues with ChatGPT’s accuracy and reliability. It can appear to be an authority on any given subject while providing citations to sources that do not exist or are not relevant to the topic being discussed. This makes it difficult for researchers to find accurate information and can lead them to draw incorrect conclusions.

ChatGPT, like many AI systems, also has well-documented issues with bias and misinformation. AI systems are trained on data that is often biased, and that bias can be reflected in their output. For example, if an AI system is trained on a dataset of news articles that come mostly from one political perspective, it may be more likely to generate text that reflects that perspective. AI systems can also be manipulated into generating false or misleading information. For example, an attacker could create a fake news article and then use an AI system to generate text that supports it. That text could then be spread online, making it difficult to distinguish real news from fake. In some cases, an overcorrection to combat bias has led to the communities affected by it being left out of the conversation completely.

It is important to be aware of the potential for bias and misinformation when using AI systems and to take steps to mitigate these risks. We look forward to working with any of you who are interested in integrating AI to develop best practices.

More Copyright ...

 

The input to generative AI (training data): should using copyrighted works for training be considered fair use? This is widely debated.

Argument A. No, it's a copyright violation
OpenAI Sued for Using Everybody's Writing to Train AI - "The class action suit, filed in California, alleges that failing to follow proper procurement guidelines, including seeking the consent of those who produced that content in the first place, amounts to straight-up data theft."

This could affect not only OpenAI but also Google, Microsoft, and Meta, since they all use similar methods to train their models.

Argument B. Yes, it's fair use
Thoughts from Creative Commons:
Fair Use: Training Generative AI - Stephen Wolfson
Better Sharing for Generative AI - Catherine Stihler

Thoughts from UC Berkeley Library Copyright Office:
UC Berkeley Library to Copyright Office: Protect fair uses in AI training for research and education

Thoughts from the Electronic Frontier Foundation (EFF):
AI Art Generators and the Online Image Market - Katharine Trendacosta and Cory Doctorow
How We Think About Copyright and AI Art - Kit Walsh
“Done right, copyright law is supposed to encourage new creativity. Stretching it to outlaw tools like AI image generators—or to effectively put them in the exclusive hands of powerful economic actors who already use that economic muscle to squeeze creators—would have the opposite effect.”

Other countries