Text and Data Mining

About this guide

This guide provides information to support NTU students and researchers who are considering using Text and Data Mining (TDM) in their projects.

What is Text and Data Mining

Definitions vary, but in essence Text and Data Mining (TDM) refers to the use of automated tools to analyse large amounts of text and data to uncover patterns, trends, and relationships. The process involves extracting data from sources such as books, journal articles, social media posts, and other digital content. The goal is to generate insights and knowledge that may not be apparent through traditional research methods.

(Paraphrased from IFLA Statement on Text and Data Mining, 2013)
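
As a rough illustration of what "automated tools" can mean in practice, the minimal Python sketch below counts how often each word appears across a few short texts. The sample sentences and the function names (tokenize, term_frequencies) are made up for this example. Real projects typically use much larger collections and more sophisticated methods.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for documents you have lawful access to.
documents = [
    "Text mining uncovers patterns in large text collections.",
    "Data mining uncovers trends and relationships in data.",
    "Mining social media posts can reveal trends.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(docs):
    """Count how often each token appears across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


frequencies = term_frequencies(documents)

# Show the most frequent terms as a very simple "pattern" in the corpus.
print(frequencies.most_common(5))
```

Even a simple count like this can surface recurring terms across documents, which is the kind of pattern that becomes hard to spot by reading once a collection grows large.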

Permissible TDM Use under Singapore's Copyright Law

In Singapore's Copyright Act 2021, text and data mining (TDM) falls under the provisions on "Computational Data Analysis" (CDA), which the Act describes as including:

(a) using a computer program to identify, extract, and analyse information or data from the work or recording; and

(b) using the work or recording as an example of a type of information or data to improve the functioning of a computer program in relation to that type of information or data.

The Act provides a legal framework under which Computational Data Analysis is permitted. Copies of copyrighted works or recordings of protected performances can be made for the purpose of TDM under specific conditions.

✅ What You Can Do:

  • Use legally accessible content – You can perform TDM on materials you have lawful access to, such as through NTU Library’s subscriptions, open-access sources, or legally purchased content.
  • Make copies for TDM – You are allowed to copy, extract, and store content only for the purpose of TDM.
  • Share copies in limited cases – You can share copies with others ONLY for:
    • Verification of results of the analysis
    • Collaborative research or study related to the analysis

🚫 What You Cannot Do:

  • Bypass paywalls or terms of use – If you gain access to the content by circumventing paywalls or violating database agreements, TDM is not permitted.
  • Use infringing copies – If the source material is an illegal copy, you cannot use it for TDM (unless you were unaware of the infringement).
  • Repurpose the data for other uses – Any copies made for TDM cannot be used for other purposes outside the research and must be deleted once the research is completed.

📝 Key Takeaways for Library Users:

  • If you plan to perform TDM on copyrighted resources, check the publisher's website for its terms of use and obtain written permission if necessary.
  • Ensure that your access to the resource is lawful, such as through NTU Library’s subscriptions.
  • If you are collaborating with others, only share copies for verification or joint research.
  • Always delete the copies once the research is completed.
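
One way to make deleting working copies routine is to keep them in a temporary folder that is removed automatically once the analysis finishes. The Python sketch below is only an illustration under that assumption: the function names (analyse_with_temporary_copies, count_words) and the sample texts are hypothetical. It does not cover copies held elsewhere, such as backups or shared drives, which would still need to be removed separately.

```python
import tempfile
from pathlib import Path


def analyse_with_temporary_copies(sources, analyse):
    """Write each source text into a temporary folder, run `analyse` on the
    copies, and let the folder be deleted when the `with` block exits."""
    with tempfile.TemporaryDirectory() as workdir:
        copies = []
        for name, text in sources.items():
            path = Path(workdir) / name
            path.write_text(text, encoding="utf-8")
            copies.append(path)
        result = analyse(copies)
    # The temporary folder and every copy inside it have been removed here.
    return result, Path(workdir).exists()


def count_words(paths):
    """Toy analysis: total number of whitespace-separated words."""
    return sum(len(p.read_text(encoding="utf-8").split()) for p in paths)


# Hypothetical sample texts standing in for lawfully accessed content.
sources = {"a.txt": "one two three", "b.txt": "four five"}
total, still_there = analyse_with_temporary_copies(sources, count_words)
print(total, still_there)
```

Only the derived result (here, a word count) leaves the function. The copies themselves do not persist after the analysis.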

Can I use TDM for my research?