Massive Scale Entity Resolution Using the Power of Apache Spark and Graph
HTML-код
- Опубликовано: 5 май 2019
- Spark's graph capabilities are great at enabling analysis of networks for use-cases such as fraud-detection, illicit network detection, and supply chain risk analysis. However, in order for a data scientist to perform analytics on a network (e.g., Page Rank, community detection, etc.), they end up spending all their time fighting a mountain of data integration challenges. A specific challenge this talk will focus on is connecting entities in a network within and across data domains. We will explore how you can leverage the Spark ecosystem's graph capabilities to perform massive-scale entity resolution (ER). As a result, your data scientists will be able to more quickly and effectively perform graph analytics that drive business and mission value. Key takeaways: 1) The Spark ecosystem enables you to quickly get started with graph analytics use-cases at scale 2) Complementing traditional ER techniques with the context of graph relationships allows you to connect entities that you could not easily connect before
About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: databricks.com...
Connect with us:
Website: databricks.com
Facebook: / databricksinc
Twitter: / databricks
LinkedIn: / databricks
Instagram: / databricksinc Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here. databricks.com... Наука
Great talk!