Open source Data Lake Management, Curation, and Governance for New and Growing Companies

Wednesday 25 January 2023

This event has finished.

Started 17:00

Finished 18:00

Organized by ODSC Lisbon Data Science

Venue: Online/Virtual

Address: Online event on your device
8600 Lisboa


About this event

**To access this webinar, please register here:**

**Topic:** “Open source Data Lake Management, Curation, and Governance for New and Growing Companies”

**Speaker:** Arjuna Chala, Associate Vice President | HPCC Systems and Special Projects

With almost 25 years of experience in software design, Arjuna is responsible for HPCC Systems evangelism, supporting customer implementations and engagement. Arjuna has a passion for data analytics innovation around healthcare, fintech, cryptocurrency, and smart devices. Dedicated to development excellence, Arjuna was part of the team that brought the HPCC Systems platform to the open-source community. In his work with HPCC Systems community leaders and startup incubators, Arjuna’s efforts have contributed to the spread of HPCC Systems technology into the enterprise domestically as well as in the international markets of China, Brazil, Europe, and India. Arjuna has a BS in Computer Science from RVCE, Bangalore University.


Data Lake technology provides a powerful way to process, refine, and present huge volumes of diverse data. But the variety of technologies available presents unique challenges for start-ups and rapidly growing companies, and these large volumes of data come at a cost. As a Data Lake evolves, it grows in size and complexity. If not properly managed, a Data Lake can outgrow the abilities and resources of the team that manages it, reducing the usefulness of an organization’s data and slowing or halting the team’s implementation of new analytics and applications.

**In this talk,** Arjuna Chala showcases how the completely free and open source HPCC Systems Data Lake platform has developed powerful storage and compute capabilities able to manage massive quantities of data, along with an open source data curation and governance system called Tombolo.

**Arjuna will discuss platform case studies from new and growing companies, as well as how the system enables you to:**

*- Achieve better performance, near real-time results and full-spectrum operational scale, without a massive development team, unnecessary add-ons or increased processing costs*

*- Curate data: automatically identify and classify a data file*

*- Govern sensitive data: automatically identify sensitive data files and apply any necessary usage restrictions to that data*

*- Keep accurate records of who, how, and when a user or application interacts with a sensitive data file*

*- Embed languages and integrate third-party tools, including Spark, MongoDB, Cassandra, Python and many more*

**ODSC Links:**

• Get free access to more talks/trainings like this on the Ai+ Training platform

• Twitter: @odsc

• ODSC East 2023, May 9-11

This page last updated Wednesday 25 January 2023 at 02:45.
