“It’s the variety, stupid”

Well, I should have suspected it. But it was good to see more than 40 experts from around the world highlighting and explaining it: the special thing about big data in agriculture is its extreme variety.

This is what you get if you compare the four V’s of big data (as IBM describes them) with the data types and sources that are typically used in agricultural, food and environmental research. We are not talking about an extremely large Volume; other domains have far more voluminous data. Nor do the data arrive at a particularly high Velocity compared to other domains. In many cases, their Veracity is quite high. But in agriculture, data Variety matters: you need to combine multiple, heterogeneous data types and formats from several sources in order to solve the information problems and support the decision making of the relevant stakeholders.

This was the main message that came out of the workshop on “Big data for food, agriculture and forestry: opportunities and challenges” that Agro-Know, FAO and the Big Data Europe project organised last week in Paris. It was a fully packed workshop, with over 40 participants who work with a large variety of data sources, types and applications – ranging from germplasm passport descriptors to animal disease outbreaks.

Our speakers represented some world-class research institutions. The workshop opened with Pascal Neveu, senior research engineer at the French National Agronomic Institute (INRA) and director of the MISTEA Laboratory. Pascal presented the big data perspective and the implementation challenges that such a large and distributed agronomic research organisation faces – did you know that INRA has 18 different centers spread across more than 40 geographical locations, and more than 10,000 people, of whom about 4,000 are researchers? Pascal also presented a specific case study: high-throughput phenotyping at 5 large, open-air experimental fields, 2 greenhouses with controlled environments, and 2 fully equipped laboratories for carrying out different types of omics analyses. Lots and lots of data are generated and managed, but the main challenge that he identified is the extreme variety of this data.

The Dutch were also there, with two of the most prestigious Wageningen UR institutes presenting their views and needs in terms of big data:

Tim Verwaart from LEI, the socio-economic research institute, talked about “Big data opportunities for marketing of horticultural products”, introducing a very interesting public-private partnership through which the fruit and flower industries are calling on big data technology researchers to help them do business better.
Rob Lokers from Alterra, the interdisciplinary environmental research institute, highlighted the amazing complexity of using big data as input for agricultural and environmental modelling, which in turn generates new (big) data, information, and knowledge that support research and policy making – in his talk on “Big data challenges and solutions in agricultural and environmental research”.

And there were more people representing our community – like Elizabeth Arnaud from Bioversity International, who gave an excellent insight into the CGIAR Big Data Analytics Platform (which we introduced in a previous blog post), and Valeria Pesce from the Global Forum on Agricultural Research (GFAR), who presented where we stand today with a global linked and open data infrastructure for agricultural development.

At the end of the workshop came one of the most thought-provoking interventions, from Stefano Bertolo of the European Commission: “I have an open question to share: is there some European company that will surprise us by disrupting the agricultural industry and help it make a competitive breakthrough using big data?”

Overall, this was an extremely interesting event that initiated the dialogue in Europe about how big data will change agricultural research, and that put in place a framework for a more coordinated effort. A series of follow-up actions, like dedicated webinars and sector-specific discussions, will come next. And we should keep our eyes open for the first deployment of an open source big data platform that our community can play with, at the end of 2015. Something like a Christmas present, right?

