Structured Vs Unstructured Data

  • Published: 12 Dec 2024

Comments • 8

  • @TechcanvassAcademy
    @TechcanvassAcademy  3 years ago

    Join Today! CBDA certification training bit.ly/42pM8mQ
    To know more about us - Contact us at +91 93255 66777 or visit us at techcanvass.com

  • @sagarpaudel5904
    @sagarpaudel5904 11 months ago

    Thank you so much for the clear information.

  • @krishnareddy3621
    @krishnareddy3621 3 years ago

    Good information.

  • @arbindkumarram6904
    @arbindkumarram6904 1 year ago

    Didi, please explain these to me once: the basic features of big data, the importance of big data, the three parameters of big data, the three important milestones in the history of big data, data features, and sources of data. ✌ I am in class 10, so please explain all of the above-mentioned subtopics of big data at that level.

  • @darkamantra21
    @darkamantra21 2 years ago +1

    What do you do if you have such data but no common identifiers? How would you relate the datasets to each other?

    • @TechcanvassAcademy
      @TechcanvassAcademy  2 years ago

      Ideally, we work with data that is connected in some logical way, for example product records and customer reviews. The reviews may or may not contain the exact product information, so we need to extract information from the unstructured review text to connect it to the other product-related data. Below are two common ways to link such data:
      Linking the data by means of a common value: For example, we can extract the product name (APT9812-389) referenced in customer reviews and search for it in the inventory catalog. Such a link is not guaranteed to reflect reality: the match may be coincidental, or it may be missed or wrong when the datasets use different short forms. Matching on multiple fields, such as the product name together with the cost or description, gives a stronger match with a higher probability of being correct.
      Linking the data based on frequently occurring common identifiers: Suppose many resumes need to be analyzed. The system reads each resume and picks out commonly identified data elements such as name, address, telephone numbers, and college attended. Each identifier can then be joined independently with other data; for example, the college can be joined with the curriculum offered by that college. It is common to get many matches in this type of join.
      So, in short, if no explicit common identifiers are present, we have to extract specific value(s) from the data to search or join on.
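
The two linking approaches described in the reply can be sketched roughly as follows. This is a minimal illustration, not the channel's own method: the catalog, reviews, resumes, and curricula are invented sample data, the product-code pattern is only guessed from the single example APT9812-389, and the resume layout with `Name:` / `College:` / `Phone:` labels is an assumption made to keep the extraction simple.

```python
import re

# Hypothetical sample data, invented for illustration only.
catalog = [
    {"product_code": "APT9812-389", "description": "wireless mouse"},
    {"product_code": "APT9812-390", "description": "wireless keyboard"},
]
reviews = [
    "The APT9812-389 mouse stopped working after a week.",
    "Great keyboard, no complaints.",
    "Ordered apt9812-390 and the keyboard arrived early.",
]

# Assumed code format: three letters, four digits, a hyphen, three digits.
CODE_PATTERN = re.compile(r"\b[A-Z]{3}\d{4}-\d{3}\b", re.IGNORECASE)


def link_reviews(reviews, catalog):
    """Link each review to a catalog item via a product code found in its text."""
    by_code = {item["product_code"]: item for item in catalog}
    links = []
    for text in reviews:
        match = CODE_PATTERN.search(text)
        # Normalise case so "apt9812-390" and "APT9812-390" link to the same item.
        code = match.group(0).upper() if match else None
        item = by_code.get(code)
        # Second field as a cross-check: does any description word appear in the review?
        confirmed = bool(item) and any(
            word in text.lower() for word in item["description"].split()
        )
        links.append({
            "review": text,
            "product_code": item["product_code"] if item else None,
            "confirmed": confirmed,
        })
    return links


# Assumed resume layout: one "Label: value" pair per line.
FIELD_PATTERN = re.compile(r"^(Name|College|Phone):\s*(.+)$", re.MULTILINE)

resumes = [
    "Name: A. Kumar\nCollege: State College\nPhone: 555-0100",
    "Name: B. Singh\nCollege: City Institute\nPhone: 555-0101",
]
# Hypothetical lookup table keyed by the extracted college name.
curricula = {"State College": ["Statistics", "Databases"]}


def extract_fields(resume_text):
    """Pick the commonly occurring identifiers out of one resume."""
    return {label.lower(): value.strip()
            for label, value in FIELD_PATTERN.findall(resume_text)}


def join_with_curricula(resumes, curricula):
    """Join each resume to curriculum data through the extracted college name."""
    rows = []
    for text in resumes:
        fields = extract_fields(text)
        # An unmatched college yields an empty course list rather than an error.
        fields["courses"] = curricula.get(fields.get("college"), [])
        rows.append(fields)
    return rows


links = link_reviews(reviews, catalog)
rows = join_with_curricula(resumes, curricula)
```

The second review contains no product code, so it stays unlinked; that is the case where a value cannot be extracted and some other field would be needed. Real review text is messier than this, so a regular expression alone would likely miss abbreviations and misspellings.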