BigDataStack - a holistic approach to the big data problem

11 Feb 2020

In January 2020 BigDataStack entered its third and final year, in which the project expects to deliver more results for dissemination, exploitation, standardisation and uptake in open source communities.

The F2F Milan meeting in January kicked off this third and final year. On 27-29 January, all 14 partners discussed the main results achieved, specifications for further optimisation and needs for the 3 use cases in which the BigDataStack solution has been deployed.
The meeting was fruitful and dove into the specifications of big data problems related to real-time monitoring, policy, analysis and decision making.

Trust-IT's role in BigDataStack

Exploitation of the results becomes crucial during the last year of the project as it opens new possibilities of adoption for different stakeholders.
BigDataStack is built upon 18 software components, and most of them will be released as open source so that application developers and other end-users can take advantage of the solution.
To better disseminate the outcomes of the project, a catalogue listing all the components has been set up and published on the website. The catalogue includes descriptive material such as factsheets and descriptions, and short videos in which the developers describe the main functionalities of each component will soon be released and linked to the entries.


The data-driven society, a bit of context

It is a fact that we are living in the "data-driven society", where the amount of data produced is astonishing. We’re talking about 1.7 MB of data per person per minute, an amount you would need a brand-new, empty 1-terabyte hard disk (estimated cost ~€50) to store each year if nobody else were interested in it.
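
As a rough sanity check of the figure above, the yearly volume can be worked out in a few lines. This is only a back-of-the-envelope sketch: it reads the 1.7 figure as megabytes per minute and assumes decimal units (1 TB = 1,000,000 MB); the function name is made up for illustration.

```python
# Back-of-the-envelope check of the "1-terabyte disk per year" claim.
# Assumptions: 1.7 is in megabytes per minute, 1 TB = 1,000,000 MB, 365-day year.
MB_PER_MINUTE = 1.7
MINUTES_PER_YEAR = 60 * 24 * 365  # 525,600 minutes


def yearly_volume_tb(mb_per_minute: float = MB_PER_MINUTE) -> float:
    """Return the data volume produced by one person in a year, in terabytes."""
    return mb_per_minute * MINUTES_PER_YEAR / 1_000_000


print(f"{yearly_volume_tb():.2f} TB per person per year")
```

Under these assumptions the result comes to roughly 0.9 TB, which is consistent with needing about one 1-terabyte disk per person per year.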

But why should somebody pay for your data?
Consider this: the supermarket around the corner, where you usually do your shopping, can learn your weekly habits thanks to a well-structured big data infrastructure. It therefore knows that every Friday at around 6 PM you go there and buy tomatoes, probably because that is when you leave work and shop for the weekend. The supermarket will never run out of tomatoes, and you will never switch to another one because they were missing when you needed them.
Isn’t that reason enough for the supermarket to pay for your data? None of this is a sci-fi story; it is what is actually happening inside one of our use cases, and it is only the smallest part of it.
Problems like fleet management or risk assessment and mitigation will be tackled with advanced machine learning techniques, bringing the power of computational science to the service of industry and society. Through real-time prediction processes, BigDataStack will support enterprises and make them capable of data-driven, optimised decision-making that enhances their businesses. In addition, because the platform is domain-agnostic, it can be applied to virtually any scenario.

Highlights of Year 2

Many topics were addressed during the Milan meeting regarding scalability and development, and the consortium took stock of the main activities and results of the first two years.
The most important result is the delivery of a demo version of the platform at the midterm review meeting (July 2019), during which all the functionalities of the tool were showcased and described. The platform is the result of very intense research activity: a number of academic publications have been published, many benchmark analyses have been run and, last but not least, many events have been attended.
Additionally, a community of more than 600 individuals has been built on Twitter and LinkedIn and, thanks to a storytelling approach, these people have received constant updates on the results and main activities of the project.

The European Open Source Initiative

Red Hat, the world's leading provider of enterprise open source solutions, leads the BigDataStack work on the European Open Source Initiative (for reference, visit opensource.org and cncf.io/sandbox-projects). BigDataStack contributes to the Open Source Initiative by releasing its source code in the form of distinct software components. In addition, Red Hat will guide the consortium in collaborating with other open-source projects for upstream uptake of the BigDataStack software components, facilitating the exploitation of the open-source artefacts produced.

Next steps

Although a lot has been done, discussed and agreed upon, in the third year the project will define a roadmap for possible exploitation in sectors other than the use-case sectors (retail, insurance and shipping). Adopting the BigDataStack solution in domains like healthcare means handling sensitive and personal data, which requires particular attention to remain compliant with European policies and regulations, first and foremost GDPR.
Lowering the barriers to data usage, though, remains one of the main objectives of the project, and additional discussion and investigation will be devoted to it.

Curious about our project? Visit bigdatastack.eu!

Discover our software components, the people involved and our use cases.

 
