Data Science: Datasets

This guide will help Data Science students find information they can use for their studies.

How to find datasets

 

Datasets are:

  • most often produced by government agencies, researchers or non-profit organizations
  • located by identifying the agency or organization that focuses on a specific research area of interest

For example, if you are interested in Australian industries, IBISWorld would be a good place to look; for population data, try the Australian Bureau of Statistics.

Using Google to find datasets

Google Dataset Search looks for datasets in thousands of repositories across the web.

It is useful for finding a broad spectrum of data, such as scientific data, government data, and data provided by news organizations.

Note: Some datasets may be behind a paywall or require a fee for you to download them.

If so, search for the source's name in the UTS Library catalogue to see if we subscribe to it (e.g. Statista). If we do, go to the database and look up the title of the dataset you found.

To find open data on a country or state in Google, search using the keywords: open data + the name of the country or state.

Google Open Data search
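
If you want to repeat this search for several places, the keyword pattern above can be sketched in Python. This is a minimal illustration only: the function names are hypothetical, and the URL shape (https://www.google.com/search?q=...) is an assumption about how Google accepts a query string, not something this guide specifies.

```python
from urllib.parse import urlencode


def open_data_query(place: str) -> str:
    """Build the keyword string: 'open data' plus a country or state name."""
    return f"open data {place.strip()}"


def google_search_url(place: str) -> str:
    """Return a search URL for the query (URL pattern is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": open_data_query(place)})


if __name__ == "__main__":
    # Print one search link per place of interest.
    for place in ["Australia", "New South Wales"]:
        print(google_search_url(place))
```

Opening each printed link in a browser should give the same results as typing the keywords by hand.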

A list of general datasets

 

Evaluating datasets

 

Steps to verify datasets

Apply the same evaluation criteria to datasets as you would to scholarly information.

  1. Who collected it?
    • Examine the explanatory notes of the data
  2. What are the credentials of the data producer?
    • Are they an expert in the field?
  3. Who sponsored the data collection?
  4. When was the data collected?
    • Are these the newest figures in the field?
  5. Who was included in the data and who was excluded?
    • Is the data biased, or is it representative of all relevant groups?
  6. Do other sources provide similar findings? 
    • Find one or two other sources to support the findings
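
One way to keep track of the six questions above while you assess a dataset is a small checklist in code. The sketch below is illustrative only: the class, field names, and the five-year recency threshold are all hypothetical choices, and the example values are invented placeholders rather than a real dataset.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DatasetCheck:
    """Answers to the six verification questions (all names are hypothetical)."""
    collector: Optional[str] = None             # 1. Who collected it?
    producer_credentials: Optional[str] = None  # 2. Credentials of the producer
    sponsor: Optional[str] = None               # 3. Who sponsored the collection?
    collected_on: Optional[date] = None         # 4. When was it collected?
    coverage_notes: Optional[str] = None        # 5. Who was included or excluded?
    corroborating_sources: int = 0              # 6. Other sources with similar findings


def open_questions(check: DatasetCheck, today: date, max_age_years: int = 5) -> List[str]:
    """List the questions that are unanswered or raise a concern.

    max_age_years is an assumed threshold; what counts as current
    depends on the field.
    """
    issues = []
    if not check.collector:
        issues.append("1. Who collected it?")
    if not check.producer_credentials:
        issues.append("2. What are the credentials of the data producer?")
    if not check.sponsor:
        issues.append("3. Who sponsored the data collection?")
    if check.collected_on is None:
        issues.append("4. When was the data collected?")
    elif (today - check.collected_on).days > max_age_years * 365:
        issues.append("4. The data may no longer be the newest in the field")
    if not check.coverage_notes:
        issues.append("5. Who was included in the data and who was excluded?")
    if check.corroborating_sources < 1:
        issues.append("6. Do other sources provide similar findings?")
    return issues


if __name__ == "__main__":
    # Placeholder values for illustration only.
    check = DatasetCheck(
        collector="Example Statistics Agency",
        collected_on=date(2015, 6, 30),
    )
    for issue in open_questions(check, today=date.today()):
        print(issue)
```

An empty list from open_questions means every question has an answer on record; it does not by itself mean the dataset is trustworthy.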