2 Good and 2 Bad Things About Google’s New Data Journalism Feature
Researching for an article, though necessary, is tedious and at times mind-numbing. Google's going to help out with a new data journalism feature.
Researching for articles is an integral part of my job. Searching through data, and then more data, is key to providing quality content. If I skip it and try to pass off my work as accurate, I’m opening myself up for a major headache. And I could possibly lose my job. I used to hate research, and although I’ve come to love it, I still grit my teeth when it comes to finding the right data, identifying which datasets to focus on, and, after all that is finally completed, digging into the data to analyze it. Along with improving grammar in Docs, Google’s going to make research easier too.
How Google’s Going To Help With Data Journalism
When I was growing up, my father would wake up at 4 am to go out into the wilderness and conduct research studies. He wasn’t researching wolves or bears or even something interesting. He was researching elk…elk. Just let that sink in for a minute. Majestic though they may be, these things don’t do much beyond graze and walk around. At least, they didn’t when I was looking at them. Since my dad was a wildlife biologist, I got the “opportunity” to go out on these trips. Sometimes, if I misbehaved, my dad decided I needed another “opportunity” to go out and help with the research. After staring at these dumb animals for untold hours, we’d go home and he’d start working through the data. It took him more than twenty years of research and analysis before he felt he had enough to present his findings.
Thank god we have Google now. With a decent ISP, Google’s at my beck and call. Make sure you take the time to look through the best cable and internet packages in your area to cut down on those twenty years of research. Without a reliable connection, I don’t think I’d be able to finish a single article. Thankfully, I have one. And with it, I can easily type in a term and Google will return the relevant information. At least, in theory, that’s what it should do.
1. Save Time
Even with Google, I still have to dig through the information it returns, and suddenly I’m back in the field with my dad, staring at a dumb animal that’s just chewing. What Google is proposing is to highlight relevant data within articles and list it above the title, as a sort of preview. Already, I can feel the weight of researching getting lighter. By seeing what data is contained within an article or a report right there on the Google search page, I could save a significant amount of time. How much time I would save is still undefined. To figure that out would probably take a good twenty years anyway. This wouldn’t make research a blissful experience; it would just ease some of the headaches of hunting for data. I’m okay with that, because there’s nothing more frustrating than opening an article and reading through a considerable chunk of it only to find it’s irrelevant. While the mistake of reading it was mine, it would have helped to get a better picture of the data contained within before I even started reading. Data previews would be a huge help in accomplishing that.
2. Refine Search Criteria
I’ll be honest: I have no master’s degree, and I didn’t excel in school when it came to research. I got by well enough, though. Now, when it comes to research for an article, I start by guessing at the search terms I need to use. There are a few times when I have a clear idea of where to look. Other times, and it happens more than I would like, I shoot in the dark until I find the right combination of words and terms. This works well enough most of the time. The other day, however, I had to dig through stuff from the FCC. Never in my life have I been so frustrated trying to find the relevant datasets. There was plenty of data to look at; I just didn’t have a clue what most of it meant. They use a lot of numbers.
Possible Drawbacks
Getting data previewed will be a huge help.
1. But humans will be looking at the previews
When only the previewed data is reviewed, information can get missed. When just the numbers and the related terms are pulled out, data can be misunderstood. With no context, we can sometimes read the data in the wrong way and draw the wrong conclusions. This may not be as bad as it seems. Professionals do research and still misinterpret data from time to time.
2. Dense Reports
The real concern is when there is a huge report. You know, the academic kind with stuffy language. The type of report where the abstract alone hurts the brain while it’s being read. These dense tomes of collected data and aggregated information may defeat the algorithm of Google’s search engine. I doubt even artificial intelligence could make sense of them.
Get Ready For It Now
Google’s developers have already prepared for this. They are asking that published articles be prepared in such a way that data is easy to identify. As Google searches far and wide through the internet, it’ll be able to pluck the right stuff out of the text if it’s been clearly labeled. There are guidelines, along with source and provenance best practices, listed in the developers’ announcement. Before all that, there’s a list of examples of what authors and journalists can prepare as data so Google’s algorithms will recognize it.
- A table or a CSV file with some data
- An organized collection of tables
- A file in a proprietary format that contains data
- A collection of files that together constitute some meaningful dataset
- A structured object with data in some other format that you might want to load into a special tool for processing
- Images capturing data
- Files relating to machine learning, such as trained parameters or neural network structure definitions
- Anything that looks like a dataset
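For anyone wondering what "labeling" a dataset might look like in practice, here is a rough sketch. It assumes the guidelines build on the schema.org `Dataset` vocabulary embedded in a page as JSON-LD, which is the structured-data format Google documents for datasets; this article doesn't spell that out, so treat it as my assumption. The function names, the elk dataset, and the `example.com` URLs are all made up for illustration.

```python
import json


def build_dataset_markup(name, description, page_url, csv_url):
    """Return a schema.org Dataset description as a JSON-LD string.

    Hypothetical helper: the property names come from schema.org,
    but which properties a search engine actually reads is an assumption.
    """
    markup = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": page_url,
        # One downloadable file that holds the data, here a CSV.
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": "text/csv",
                "contentUrl": csv_url,
            }
        ],
    }
    return json.dumps(markup, indent=2)


def as_script_tag(json_ld):
    """Wrap the JSON-LD so it can be pasted into an article's HTML."""
    return '<script type="application/ld+json">\n' + json_ld + "\n</script>"


# Made-up example values; swap in a real article and data file.
example = build_dataset_markup(
    name="Elk sightings by survey day",
    description="Daily counts of elk observed during field surveys.",
    page_url="https://example.com/articles/elk-survey",
    csv_url="https://example.com/data/elk-survey.csv",
)
print(as_script_tag(example))
```

The printed `<script>` block would sit in the article's HTML next to the table or CSV link it describes. Whether a given search engine picks it up depends on its own guidelines, so check the announcement before relying on this.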