fcento
  • Videos 3
  • Views 96,439
Parsing XML with Namespaces with Python (xml.etree.ElementTree)
Example of an approach for XML data with namespaces using Python xml.etree.ElementTree
Reference: docs.python.org/3/library/xml.etree.elementtree.html
Previous video, overview of the Python xml.etree.ElementTree module: ruclips.net/video/bWfAD7wAfOI/видео.html
Views: 19,241

Videos

Fixing Missing Element Tags in XML with Python (xml.etree.ElementTree)
4.1K views, 3 years ago
Example of an approach for XML data that is missing tags in elements, using Python xml.etree.ElementTree. Reference: docs.python.org/3/library/xml.etree.elementtree.html Previous video, overview of the Python xml.etree.ElementTree module: ruclips.net/video/bWfAD7wAfOI/видео.html Next video, XML with Namespaces: ruclips.net/video/aB_koPUNqfo/видео.html
Parsing XML files with Python (xml.etree.ElementTree)
73K views, 4 years ago
Overview of the Python xml.etree.ElementTree module for parsing, editing, and creating XML files. Reference: docs.python.org/3/library/xml.etree.elementtree.html Next video of the series, covering special cases such as elements with missing tags: ruclips.net/video/5BrVPpOifto/видео.html

Comments

  • @nk461
    @nk461 11 days ago

    What the font bro? 😅 Can't read, I think I am too old

    • @fcento
      @fcento 11 days ago

      😅

  • @debasishsahoo1268
    @debasishsahoo1268 1 month ago

    Awesome

  • @jdvelasquezr
    @jdvelasquezr 2 months ago

    Thank you, Francesco, for taking the time to review this library's different functions. You have greatly helped me finish a much-needed script for our localization engineering tasks. Notably, adding text to an existing tag saved the day.

  • @ShivModiShankar
    @ShivModiShankar 5 months ago

    Thanks for saving my day Francesco :)

  • @attilioturco
    @attilioturco 5 months ago

    nice vid thanks

  • @AnEngineeringGirl
    @AnEngineeringGirl 6 months ago

    After editing the xml file, I don't want the ns tag in each line. What should I do?
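One common way to avoid the generated `ns0:` prefixes when writing the file back is to register the document's namespace before serializing. This is a minimal sketch, not the video author's code; the namespace URI and the sample document are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and document, for illustration only.
NS_URI = "http://example.com/ns"
xml_text = f'<root xmlns="{NS_URI}"><item>1</item></root>'

# Map the URI to the empty prefix so it is serialized as the default
# namespace instead of an auto-generated "ns0" prefix.
ET.register_namespace("", NS_URI)

root = ET.fromstring(xml_text)
# Tags are stored internally as "{uri}localname".
root.find(f"{{{NS_URI}}}item").text = "2"

output = ET.tostring(root, encoding="unicode")
print(output)
```

Note that `register_namespace` changes a global registry, so it affects every later serialization in the same process.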

  • @hoof-hearted-2024
    @hoof-hearted-2024 6 months ago

    Can you show us how to parse a Tableau dashboard file (*.twb)? It's an XML file, Tableau just renamed it. I am trying to create a data dictionary from the .twb file.

  • @saranya548
    @saranya548 9 months ago

    Thank you Francesco for explaining this concept so easily with a demo.

  • @user-xu8od8tf9l
    @user-xu8od8tf9l 10 months ago

    Thanks a lot for the great tutorial. Your approach to XML parsing was spot-on for me and it was exactly what I was looking for to get started on XML parsing.

  • @RodrigoMontes
    @RodrigoMontes 1 year ago

    Excellent man! This is what I was looking for :)

  • @equipagescatamaranlangrune8331

    Nice job Francesco, thank you. I do have a question regarding your last example. How do you get x.tag without the namespace in it?
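ElementTree stores a namespaced tag as `{uri}localname`, so one simple approach is to cut the string at the closing brace. A minimal sketch follows; the helper name `local_name` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with a default namespace.
xml_text = '<root xmlns="http://example.com/ns"><item>1</item></root>'
root = ET.fromstring(xml_text)


def local_name(tag: str) -> str:
    """Return the tag without its '{namespace}' prefix, if it has one."""
    return tag.rsplit("}", 1)[-1]


names = [local_name(el.tag) for el in root.iter()]
print(names)
```

Tags without a namespace contain no `}` and are returned unchanged.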

  • @bayrakmusti1
    @bayrakmusti1 1 year ago

    That's how it is supposed to be taught. I have been browsing the courses on how to do it and they all are complicated. Thankfully found this video. Thanks a lot. Great job!

  • @nealrutgerskid
    @nealrutgerskid 1 year ago

    thank you

  • @5328csabi
    @5328csabi 1 year ago

    Tried to find a solution on Stack Overflow and other pages, and this video was the solution: good examples, and good explanations with actual code being run. Thank you!

  • @stevemorse5052
    @stevemorse5052 1 year ago

    Francesco, thank you. Thanks to you I now somewhat understand what is happening in the XML file I have; I did not know that it contained namespaces. I am writing this within the first 6 minutes, before I have read the comments or watched the whole video. You have saved me hours of coding, as I was going to write my own parser. All the other videos sorta, kinda, neglected this small detail! Now, after reading the comments, I see you have helped a LOT of people. Thank you again.

  • @narayanamurthyuppala6049
    @narayanamurthyuppala6049 1 year ago

    Thanks a lot. It saved a lot of my time.

  • @aryan6536
    @aryan6536 1 year ago

    Have you stopped creating videos?

    • @fcento
      @fcento 1 year ago

      Is there anything in particular you would like to see? Been thinking to possibly do a video on CuPy

  • @jezhayes
    @jezhayes 1 year ago

    Thank you so much, I was beginning to think I was cursed to manually write a text parser for these XML files forever.

  • @davidjnevin
    @davidjnevin 1 year ago

    Really excellent explanation. Thank you so much.

  • @maloman1989
    @maloman1989 2 years ago

    Really crystal clear tutorial. I now understand a lot of things I didn't understand about XML namespaces. Thanks a lot, guy!

  • @giacomocillari4448
    @giacomocillari4448 2 years ago

    Is there a way to change a sub-element instead of the whole element string? Let's say, for example, that I want to change W to SW but not the name, and I need to do it in a loop, so I can't put the name string inside because it changes every time. Is there a way to call the specific sub-element?

  • @markdillon9588
    @markdillon9588 2 years ago

    can you mass edit multiple files?

  • @vvtwins4kidz
    @vvtwins4kidz 2 years ago

    this code is specific to specific xml, the code should be generic for any XML, if tags are missing it should add the missing tags

    • @aryan6536
      @aryan6536 1 year ago

      Surely you can use this information now to achieve many things. Are you having any issues? Please share; someone might be able to help.

  • @skillbuilder138
    @skillbuilder138 2 years ago

    Hi, how do I write the content of etree.dump to an XML file?
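`ET.dump` only prints to standard output for debugging, so a usual alternative is `ElementTree.write`. The sketch below assumes a small hypothetical document and writes to a temporary directory; the file name `output.xml` is illustrative.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical in-memory document; in practice this would be the edited tree.
root = ET.fromstring("<data><country name='Liechtenstein'/></data>")
tree = ET.ElementTree(root)

# Write the tree to disk, including the XML declaration.
out_path = os.path.join(tempfile.mkdtemp(), "output.xml")
tree.write(out_path, encoding="utf-8", xml_declaration=True)

# Read it back to confirm the file is well-formed.
reloaded = ET.parse(out_path).getroot()
print(reloaded.tag)
```

If the tree came from `ET.parse(...)`, the returned object already has a `write` method and no wrapping in `ET.ElementTree` is needed.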

  • @UsmanSaadat
    @UsmanSaadat 2 years ago

    Thanks a lot for this video. I couldn't grasp the concepts properly even after reading about them in books. This video made it look like a piece of cake.

  • @vijayalakshmi8282
    @vijayalakshmi8282 2 years ago

    Hi Francesco, great video, thanks. I need a small suggestion here. Let's say I have <KTOPL> 100</KTOPL> and I need output like KTOPL 100, so I need both the tag and the value. How can we get that? Can you please explain?
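Every element exposes its name as `.tag` and its content as `.text`, so both can be read together. A minimal sketch, assuming the element sits inside a hypothetical `<row>` wrapper:

```python
import xml.etree.ElementTree as ET

# The <row> wrapper is hypothetical; only <KTOPL> comes from the question.
root = ET.fromstring("<row><KTOPL> 100</KTOPL></row>")

# .text can be None for empty elements, hence the "or ''" guard.
pairs = [(el.tag, (el.text or "").strip()) for el in root]

for tag, value in pairs:
    print(tag, value)
```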

  • @A_A7337
    @A_A7337 2 years ago

    Great video. Thanks

  • @sidjjj
    @sidjjj 2 years ago

    Thanks for this video, I needed to parse xml from a variable instead of a file and found this : xml_data_tree = ET.fromstring(received_packet)
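As a short illustration of the call mentioned above: `ET.fromstring` accepts a `str` or `bytes` value and returns the root `Element` directly (not an `ElementTree`). The packet content below is a made-up placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload, e.g. bytes received from a socket or an HTTP response.
received_packet = b"<?xml version='1.0' encoding='utf-8'?><msg><id>7</id></msg>"

# fromstring() parses the data and returns the root element.
xml_data_tree = ET.fromstring(received_packet)

msg_id = xml_data_tree.findtext("id")
print(xml_data_tree.tag, msg_id)
```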

  • @CinemagicMindset
    @CinemagicMindset 2 years ago

    Hi Francesco, I'm getting an error while parsing an XML file because it contains special characters. Kindly help me avoid this error. Error: xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 277, column 366

    • @fcento
      @fcento 2 years ago

      If you are sure the file you have is a valid xml (there are online tools to help you there), then what comes to mind is incorrect encoding. Check the documentation here: docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.XMLParser

  • @Gamer-mg6my
    @Gamer-mg6my 2 years ago

    Hi i'm trying to get the text of every tag named <Text></Text>, but inside every tag has this: <pp IX='0'/><cp IX='0'/>, some idea to extract/ the content of the tags?: <?xml version='1.0' encoding='utf-8'?> <VisioDocument xmlns='urn:schemas-microsoft-com:office:visio'> <Colors> <ColorEntry IX='0' RGB='#000000'/> <ColorEntry IX='1' RGB='#FFFFFF'/> </Colors> <Fonts> <FontEntry ID='0' Unicode='0' Weight='0' Attributes='23040' CharSet='0' PitchAndFamily='18' Name='monospace'/> </Fonts> <StyleSheets> <StyleSheet ID='0' NameU='No Style'> <StyleProp> <EnableFillProps>1</EnableFillProps> <EnableLineProps>1</EnableLineProps> <EnableTextProps>1</EnableTextProps> <HideForApply>0</HideForApply> </StyleProp> <Line> <LineColor>#000000</LineColor> <LinePattern>1</LinePattern> <LineWeight>0.010000</LineWeight> </Line> <Fill> <FillBkgnd>#000000</FillBkgnd> <FillForegnd>#000000</FillForegnd> <FillPattern>1</FillPattern> <ShdwForegnd>#000000</ShdwForegnd> </Fill> <TextBlock> <BottomMargin>0.000000</BottomMargin> <DefaultTabStop>0.590551</DefaultTabStop> <LeftMargin>0.000000</LeftMargin> <RightMargin>0.000000</RightMargin> <TextBkgnd>0</TextBkgnd> <TextBkgndTrans>0.000000</TextBkgndTrans> <TextDirection>0</TextDirection> <TopMargin>0.000000</TopMargin> <VerticalAlign>1</VerticalAlign> </TextBlock> <Char IX='0'> <Color>#000000</Color> <Font>0</Font> <FontScale>1.000000</FontScale> <Size>0.166667</Size> </Char> <Para IX='0'> <BulletFontSize>-1</BulletFontSize> <BulletStr>&amp;#xe000;</BulletStr> <HorzAlign>0</HorzAlign> <SpLine>-1.200000</SpLine> </Para> <Tabs IX='0'/> </StyleSheet> </StyleSheets> <Pages> <Page ID='0'> <PageSheet ID='0'> <PageProps> <PageWidth>1.651575</PageWidth> <PageHeight>0.748031</PageHeight> </PageProps> </PageSheet> <Shapes> <Shape ID='1' Type='Shape' FillStyle='0' LineStyle='0' TextStyle='0' NameU='Polygon.1'> <Data1>0</Data1> <Data2>0</Data2> <Data3>0</Data3> <XForm> <Height>0.708661</Height> <PinX>3.720472</PinX> <PinY>6.023622</PinY> 
<Width>1.612205</Width> </XForm> <Fill> <FillBkgnd>#000000</FillBkgnd> <FillForegnd>#FFFFFF</FillForegnd> <FillPattern>1</FillPattern> <ShdwForegnd>#000000</ShdwForegnd> </Fill> <Line> <LineCap>1</LineCap> <LineColor>#000000</LineColor> <LinePattern>1</LinePattern> <LineWeight>0.039370</LineWeight> </Line> <Geom IX='0'> <NoFill>0</NoFill> <NoLine>0</NoLine> <NoShow>0</NoShow> <NoSnap>0</NoSnap> <MoveTo IX='1'> <X>0.000000</X> <Y>0.000000</Y> </MoveTo> <LineTo IX='2'> <X>1.612205</X> <Y>0.000000</Y> </LineTo> <LineTo IX='3'> <X>1.612205</X> <Y>-0.708661</Y> </LineTo> <LineTo IX='4'> <X>0.000000</X> <Y>-0.708661</Y> </LineTo> <LineTo IX='5'> <X>0.000000</X> <Y>0.000000</Y> </LineTo> </Geom> </Shape> <Shape ID='2' Type='Shape' FillStyle='0' LineStyle='0' TextStyle='0' NameU='Text.2'> <Data1>0</Data1> <Data2>0</Data2> <Data3>0</Data3> <XForm> <Height>0.247563</Height> <PinX>3.889961</PinX> <PinY>5.511811</PinY> <Width>1.273228</Width> </XForm> <Char IX='0'> <Color>#000000</Color> <Font>0</Font> <FontScale>1.000000</FontScale> <Size>0.247563</Size> </Char> <Para IX='0'> <HorzAlign>1</HorzAlign> </Para> <Text><pp IX='0'/><cp IX='0'/>Entity</Text> </Shape> <Shape ID='1' Type='Shape' FillStyle='0' LineStyle='0' TextStyle='0' NameU='Polygon.1'> <Data1>0</Data1> <Data2>0</Data2> <Data3>0</Data3> <XForm> <Height>0.708661</Height> <PinX>3.720472</PinX> <PinY>6.023622</PinY> <Width>1.612205</Width> </XForm> <Fill> <FillBkgnd>#000000</FillBkgnd> <FillForegnd>#FFFFFF</FillForegnd> <FillPattern>1</FillPattern> <ShdwForegnd>#000000</ShdwForegnd> </Fill> <Line> <LineCap>1</LineCap> <LineColor>#000000</LineColor> <LinePattern>1</LinePattern> <LineWeight>0.039370</LineWeight> </Line> <Geom IX='0'> <NoFill>0</NoFill> <NoLine>0</NoLine> <NoShow>0</NoShow> <NoSnap>0</NoSnap> <MoveTo IX='1'> <X>0.000000</X> <Y>0.000000</Y> </MoveTo> <LineTo IX='2'> <X>1.612205</X> <Y>0.000000</Y> </LineTo> <LineTo IX='3'> <X>1.612205</X> <Y>-0.708661</Y> </LineTo> <LineTo IX='4'> <X>0.000000</X> 
<Y>-0.708661</Y> </LineTo> <LineTo IX='5'> <X>0.000000</X> <Y>0.000000</Y> </LineTo> </Geom> </Shape> <Shape ID='2' Type='Shape' FillStyle='0' LineStyle='0' TextStyle='0' NameU='Text.2'> <Data1>0</Data1> <Data2>0</Data2> <Data3>0</Data3> <XForm> <Height>0.247563</Height> <PinX>3.889961</PinX> <PinY>5.511811</PinY> <Width>1.273228</Width> </XForm> <Char IX='0'> <Color>#000000</Color> <Font>0</Font> <FontScale>1.000000</FontScale> <Size>0.247563</Size> </Char> <Para IX='0'> <HorzAlign>1</HorzAlign> </Para> <Text><pp IX='0'/><cp IX='0'/>EntityTwo</Text> </Shape> <Shape ID='1' Type='Shape' FillStyle='0' LineStyle='0' TextStyle='0' NameU='Polygon.1'> <Data1>0</Data1> <Data2>0</Data2> <Data3>0</Data3> <XForm> <Height>0.708661</Height> <PinX>3.720472</PinX> <PinY>6.023622</PinY> <Width>1.612205</Width> </XForm> <Fill> <FillBkgnd>#000000</FillBkgnd> <FillForegnd>#FFFFFF</FillForegnd> <FillPattern>1</FillPattern> <ShdwForegnd>#000000</ShdwForegnd> </Fill> <Line> <LineCap>1</LineCap> <LineColor>#000000</LineColor> <LinePattern>1</LinePattern> <LineWeight>0.039370</LineWeight> </Line> <Geom IX='0'> <NoFill>0</NoFill> <NoLine>0</NoLine> <NoShow>0</NoShow> <NoSnap>0</NoSnap> <MoveTo IX='1'> <X>0.000000</X> <Y>0.000000</Y> </MoveTo> <LineTo IX='2'> <X>1.612205</X> <Y>0.000000</Y> </LineTo> <LineTo IX='3'> <X>1.612205</X> <Y>-0.708661</Y> </LineTo> <LineTo IX='4'> <X>0.000000</X> <Y>-0.708661</Y> </LineTo> <LineTo IX='5'> <X>0.000000</X> <Y>0.000000</Y> </LineTo> </Geom> </Shape> <Shape ID='2' Type='Shape' FillStyle='0' LineStyle='0' TextStyle='0' NameU='Text.2'> <Data1>0</Data1> <Data2>0</Data2> <Data3>0</Data3> <XForm> <Height>0.247563</Height> <PinX>3.889961</PinX> <PinY>5.511811</PinY> <Width>1.273228</Width> </XForm> <Char IX='0'> <Color>#000000</Color> <Font>0</Font> <FontScale>1.000000</FontScale> <Size>0.247563</Size> </Char> <Para IX='0'> <HorzAlign>1</HorzAlign> </Para> <Text><pp IX='0'/><cp IX='0'/>EntityThree</Text> </Shape> </Shapes> </Page> </Pages> </VisioDocument>

    • @fcento
      @fcento 2 years ago

      Let's take it in steps. I'm assuming you want to extract 'Entity', 'EntityTwo', 'EntityThree' from the element <Text> (...let me know if I misunderstood your question). The way it's formatted, it contains 2 elements (<pp> and <cp>) as well as the piece of text you want to extract. If you just use findall() and use 'text' you get None back; what you want to use in this case is 'tail' instead. I've included sample code here: gist.github.com/fcento100/74b8691af014a8126f8e9ca2ff03c6ea

    • @fcento
      @fcento 2 years ago

      I've put the XML code from your comment in a file here gist.github.com/fcento100/19cb7ae6b857c539a2c2843519239efc for convenience

    • @Gamer-mg6my
      @Gamer-mg6my 2 years ago

      @@fcento Yes, you understood me correctly. Ohhh, with tail. Well, I checked it, but with another XML it didn't run :( so instead I used findall('.//cp', ns) and printed elm.tail, and with that we got the text. I like your solution more, but with the other XML it didn't run :((((( This is the error I got: elmtail = elm.tail.strip() AttributeError: 'NoneType' object has no attribute 'strip'

    • @fcento
      @fcento 2 years ago

      Apologies for not catching the 'NoneType' error; 'tail' returns None, rather than an empty string, if it doesn't find anything. It's fixed now in this version: gist.github.com/fcento100/11847ad0d8d42eec6c1dc42de897b842 with an if statement to catch it. The reason I wasn't getting this error is that I copied and pasted from your message, and since it was formatted, 'tail' returned '\n' and '\t' (the string representations of new-line and tab) where it should have returned None, which is why I was able to run the strip command everywhere without error. In the new code I posted I've shown 2 methods of getting at that piece of data; in your sample XML, "Entity" etc. is the tail of <cp>; root.findall('.//visio:Text/',ns) and root.findall('.//visio:cp',ns) do similar things. The only difference is that using './/visio:Text/' in method 1 will also extract the tail of <pp> if it is available, which may be undesirable! In that case './/visio:cp', like you suggested, is the way to go.

    • @Gamer-mg6my
      @Gamer-mg6my 2 years ago

      @@fcento a lot of thanks for your kind help Francesco :))
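The `tail` approach discussed in this thread can be sketched as follows. This is not the code from the linked gists, only an illustration of the same idea on a shortened, partly made-up document: text that follows a child element is stored in that child's `tail`, which is `None` when nothing follows it.

```python
import xml.etree.ElementTree as ET

# Shortened sample based on the question; the second, empty <Text> is
# added here only to show the None case.
xml_text = (
    "<VisioDocument xmlns='urn:schemas-microsoft-com:office:visio'>"
    "<Text><pp IX='0'/><cp IX='0'/>Entity</Text>"
    "<Text><pp IX='0'/><cp IX='0'/></Text>"
    "</VisioDocument>"
)
ns = {"visio": "urn:schemas-microsoft-com:office:visio"}
root = ET.fromstring(xml_text)

texts = []
for cp in root.findall(".//visio:cp", ns):
    tail = cp.tail
    # Guard against None before calling strip(), and skip whitespace-only tails.
    if tail is not None and tail.strip():
        texts.append(tail.strip())

print(texts)
```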

  • @Gamer-mg6my
    @Gamer-mg6my 2 years ago

    Hi. I have this xmlns and I have tried every solution, but I can't read the XML with this namespace: <VisioDocument xmlns='urn:schemas-microsoft-com:office:visio'> ..... </VisioDocument> Any idea how to solve this? :/ It is an XML file, but the file extension is .vdx

    • @fcento
      @fcento 2 years ago

      Not enough information. What error do you get?

    • @Gamer-mg6my
      @Gamer-mg6my 2 years ago

      @@fcento Hi, thank you very much for your contributions. I don't know why it didn't run a while ago, but now it works. In the end I did it with the solution you recommended at 14:20.

  • @IMMORTALmen
    @IMMORTALmen 2 years ago

    Thank you! I spent almost two days trying to resolve a namespace issue, and then found this video. Thank you.

  • @stanleymbah8983
    @stanleymbah8983 2 years ago

    thank you for this

  • @vishalkalal22
    @vishalkalal22 2 years ago

    Hi Francesco, in my XML a closing tag is missing. For example, current XML: <data> <country name="Liechtenstein"> <rank>1</rank> <year>2008</year> <gdppc>141100</gdppc> <neighbor name="Austria" direction="E"/> <neighbor name="Switzerland" direction="W"/> </country> Expected XML: <data> <country name="Liechtenstein"> <rank>1</rank> <year>2008</year> <gdppc>141100</gdppc> <neighbor name="Austria" direction="E"/> <neighbor name="Switzerland" direction="W"/> </country> </data> Can you please help me with this?

    • @fcento
      @fcento 2 years ago

      If the closing tag is missing, it is technically an invalid file. You may have to manually close it yourself as there is no way for the code to know where it should be closed.

    • @AnilKumar23456
      @AnilKumar23456 2 years ago

      Vishal, you can use a regex to find the missing closing tag and then just insert the desired closing tag.

  • @myyoutubeaccount0123_
    @myyoutubeaccount0123_ 2 years ago

    thanks a lot

    • @fcento
      @fcento 2 years ago

      Happy to help

  • @kn7298
    @kn7298 2 years ago

    I know it's a bit late, but you can correct the XML layout after adding a new subelement by using the indent() function of ElementTree. This refreshes the automatic indentation for the whole tree (or element). You may need to specify space='    ' (4 spaces) to override the default setting and match the default layout of the dump method: ET.indent(tree, space='    ') for the entire tree, or ET.indent(elm, space='    ', level=1) for just the element, although I did have problems with inconsistent spacing when trying to indent just the element!! :O
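The suggestion above can be sketched like this. `ET.indent` is available from Python 3.9; the sample document and the added `<year>` element are illustrative only.

```python
import xml.etree.ElementTree as ET

# Small illustrative document without any whitespace between elements.
root = ET.fromstring("<data><country><rank>1</rank></country></data>")

# Add a new subelement; without re-indenting it would sit on the same line.
ET.SubElement(root.find("country"), "year").text = "2008"

# Re-indent the whole tree in place, four spaces per level (Python 3.9+).
ET.indent(root, space="    ")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Indenting the whole tree, rather than a single element, sidesteps the inconsistent spacing mentioned in the comment.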

  • @AlvaroMelchor
    @AlvaroMelchor 2 years ago

    Thanks, it's very clear, most of the tutorials cover only the non namespaces xml

  • @tessdejaeghere6972
    @tessdejaeghere6972 2 years ago

    Super helpful, thanks a lot!

  • @maxnoish
    @maxnoish 2 years ago

    Brilliant !!!

  • @LukasNachtigall
    @LukasNachtigall 2 years ago

    Hey man! You just helped me finish my parser! I do not know how, but I was finally able to find and change the text in each specific element in my XML file. I still do not understand why I have to register the namespace, but it did the job perfectly. Your video was really helpful. I didn't understand 70% of it the first time, but now it's clearer to me. Thanks man!

    • @fcento
      @fcento 2 years ago

      Glad I could help!

  • @shrinivasulunandyala9269
    @shrinivasulunandyala9269 2 years ago

    Merging XML files using Python: can you please make a video on this topic?

  • @andrewbourne2296
    @andrewbourne2296 2 years ago

    That was SO helpful Dude. Thank you so much.

  • @rupeshbhuju2897
    @rupeshbhuju2897 2 years ago

    Hi Francesco Cento. I would like to know how to handle the two use cases below: 1) Incorrect type of data inside an element, e.g. a string inside an element that is supposed to contain an integer. 2) Missing element: an element that must be present according to the XSD is not present in the XML. Could you please suggest any ideas on this? Thanks

    • @fcento
      @fcento 2 years ago

      Rupesh, for your use case you may have to refer to a different library called “lxml”, which has XML Schema (XSD) support. I’m not experienced with this, but looking at the documentation it has an example of how to construct a validator, which should address both your issues.

    • @rupeshbhuju2897
      @rupeshbhuju2897 2 years ago

      @@fcento Thank you for your suggestion. I will look into that more.

  • @GuitFishN
    @GuitFishN 2 years ago

    This was a huge help. Thank you!

  • @thomasloia8874
    @thomasloia8874 2 years ago

    Superb, exactly what I needed to know. Thank you

  • @arshap9351
    @arshap9351 3 years ago

    Increase your font size before doing tutorials. It's quite hard to read the text. Anyway, good job.

  • @prankurgarg6618
    @prankurgarg6618 3 years ago

    Hi Francesco, thanks for sharing your knowledge with us. I have a question: how can we get an element based on the value of another element? If attachment is Local, I want to print symid. Can you or someone else here please help me? <?xml version="1.0" standalone="yes" ?> <SymCLI_ML> <Box> <Symm_Info> <symid>000120000369</symid> <attachment>Local</attachment> <model>model1</model> <microcode_version>6079</microcode_version> <cache_megabytes>250880</cache_megabytes> <cache_gigabytes>245.0</cache_gigabytes> <devices>8532</devices> <physical_devices>502</physical_devices> </Symm_Info> </Box> <Box> <Symm_Info> <symid>000120000566</symid> <attachment>Local</attachment> <model>model2</model> <microcode_version>6079</microcode_version> <cache_megabytes>520192</cache_megabytes> <cache_gigabytes>508.0</cache_gigabytes> <devices>10391</devices> <physical_devices>512</physical_devices> </Symm_Info> </Box> <Box> <Symm_Info> <symid>000120000568</symid> <attachment>Local</attachment> <model>model3</model> <microcode_version>6079</microcode_version> <cache_megabytes>165888</cache_megabytes> <cache_gigabytes>162.0</cache_gigabytes> <devices>19444</devices> <physical_devices>880</physical_devices> </Symm_Info> </Box> </SymCLI_ML>
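One possible way to select a value based on a sibling element is to loop over the parent elements and test the sibling's text, or to use the `[child='text']` predicate that ElementTree's limited XPath supports. The sketch below uses a shortened version of the XML from the question; the second `<Box>` with `Remote` is made up to show the filter doing something.

```python
import xml.etree.ElementTree as ET

# Shortened sample; the "Remote" entry is hypothetical.
xml_text = """<SymCLI_ML>
<Box><Symm_Info><symid>000120000369</symid><attachment>Local</attachment></Symm_Info></Box>
<Box><Symm_Info><symid>000120000999</symid><attachment>Remote</attachment></Symm_Info></Box>
</SymCLI_ML>"""
root = ET.fromstring(xml_text)

# Option 1: explicit loop, checking the sibling's text.
local_ids = [
    info.findtext("symid")
    for info in root.iter("Symm_Info")
    if (info.findtext("attachment") or "").strip() == "Local"
]

# Option 2: XPath predicate on the child's text (exact match, no stripping).
predicate_ids = [
    info.findtext("symid")
    for info in root.findall(".//Symm_Info[attachment='Local']")
]

print(local_ids, predicate_ids)
```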

  • @padraigmaccu9333
    @padraigmaccu9333 3 years ago

    Thank you very much, Francesco. Something I have needed for a long time. Pádraig Mac Con Uladh

  • @KrishnaManohar8021
    @KrishnaManohar8021 3 years ago

    Looking forward...

  • @xpanded4806
    @xpanded4806 3 years ago

    Thanks a lot