The best tutorial about NiFi, comprehensive and instructive!
I wish you would post more tutorials like this.
Love this!!!
At "11:13" you can see the header needs a newline, as it is merged with the 1st line of data.
Thanks for your videos, they are very useful. Greetings from Ecuador.
Awesome, and thank you. Great to see that others from all over are able to get something out of the videos.
Thank you Steven.
I am learning a lot from watching your videos.
Happy to help!
Thanks. Your tutorials are great and helped a lot.
Glad they are helpful.
Can you please show the expression for incrementing or decrementing values in an attribute?
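If it helps, a minimal sketch using UpdateAttribute with NiFi Expression Language (the attribute name "counter" is just a placeholder):

    UpdateAttribute
      counter (dynamic property): ${counter:plus(1)}

Use ${counter:minus(1)} instead to decrement. Both functions treat the attribute as a number, so it must already hold a numeric value.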
Will you please upload a video on how to upgrade NiFi from a previous version to a new version without changing the existing flow files?
Thank you very much for this tutorial.
Could you please make some videos on using the Wait and Notify processors? It would be very helpful for us.
Thanks in advance!
Hi Steve, excellent series of videos. I have a question: I am trying to merge identically formatted files from multiple sources into one file, but the MergeContent processor recommends there be only one source, as in your demo. Which processor, if any, do you recommend for handling multiple inputs robustly?
Hi Steven, excellent videos. Q: the .csv output of this video that gets generated from the MergeContent processor comes out with the header and the first row on the same line. I followed all the steps you mentioned, and I see it comes out the same way for you in the video, but I assume it should be two lines (one for the header, the second for the first row). How can I fix this? Thanks.
While specifying the header, end it with a newline, i.e., by typing Shift+Enter. Then the first row will appear on the next line.
Hello, the same thing happened to me. Here is how you can solve it: in the MergeContent processor's Header property, add a second line and leave it empty, so the content begins to be written from the second line.
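In other words, the Header value must end with a newline. A minimal sketch of the relevant MergeContent properties (the column names are placeholders):

    MergeContent
      Header: col1,col2,col3 followed by a newline (press Shift+Enter at the end of the value)
      Demarcator: a single newline, so each merged record lands on its own line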
Hi Steven, how are you? Thank you for sharing this tutorial. I would like to ask you something. I have developed a very similar flow where MergeContent works fine (1 flowfile in the merged port) when I stop the processor, let flowfiles accumulate in the input queue, and then run MergeContent. However, if I run my whole flow, MergeContent outputs 3 flowfiles instead of 1. Any ideas? Thank you.
@leomax87
If you have your MergeContent set up for 1 max bin and are trying to make sure that you always get 1 merged flowfile, there is a property in the MergeContent processor called "Max Bin Age". Set this to a minute, or to an amount of time that makes sense in your flow, and the bin must reach that age before it is released to create the flowfile. That should help you. You can also use a Wait/Notify setup before the MergeContent to release all the flowfiles at once, but that could still give you the same issue.
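A minimal sketch of that setup (the values are just examples; note that Minimum Number of Entries should be set higher than the number of flowfiles you expect, so that only the bin age, not the entry count, releases the bin):

    MergeContent
      Merge Strategy: Bin-Packing Algorithm
      Maximum Number of Bins: 1
      Minimum Number of Entries: 10000
      Max Bin Age: 1 min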
Hi Steven, like the rest of the users I am finding your videos very useful. In my use-case I would like to have a different delimiter (other than a comma) since the data I am pulling in is likely to also contain this character. I am surprised that the AttributesToCSV processor does not have a setting to adjust this - or maybe I am missing something. What are your thoughts on this? Thank you
Thank you, Eric, great question.
I would suggest using the ConvertRecord processor. If your data doesn't start off as JSON, then you could use AttributesToJSON -> ConvertRecord. With ConvertRecord you can use a JSON reader and a CSVRecordSetWriter. The CSVRecordSetWriter gives you a lot of control over all the properties of the generated CSV, including the "Value Separator", which is your delimiter in this case. I hope this helps.
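A minimal sketch of that ConvertRecord setup (the pipe character is just an example delimiter):

    ConvertRecord
      Record Reader: JsonTreeReader
      Record Writer: CSVRecordSetWriter
          Value Separator: |
          Include Header Line: true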
How can you handle CDC changes in NiFi? I have to pull data from Oracle, where I need to check the max ID.
Parm, take a look at the "QueryDatabaseTableRecord" processor. When you look at the property list you will see "Maximum-value Columns"; check out the description on it. It allows you to provide the columns that you want to use to track data changes.
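A minimal sketch (the table and column names are just placeholders):

    QueryDatabaseTableRecord
      Database Connection Pooling Service: (your Oracle connection pool)
      Table Name: ORDERS
      Maximum-value Columns: ID
      Record Writer: JsonRecordSetWriter

The processor keeps the largest ID it has seen in its state, so each run only pulls rows whose ID is higher than the previous run's maximum.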
Hello Steven,
I have JSON files in NiFi where some of the numeric fields contain spaces, which prevents me from uploading them into the Oracle database.
In NiFi, how can I delete all spaces in certain fields inside a JSON file?
Thank you so much for all of your very interesting and instructive videos!
Hello,
I can think of a couple of ways to do this.
1. Use the 'ReplaceText' processor in the flow. This lets you set up a search value and replacement value with Regex Replace (see the sketch after this list).
2. You can add all the key:value pairs in the JSON to the attributes of the flowfile, or just the problematic ones, and then use the 'QueryRecord' processor with Calcite SQL to fix the problem with a CAST or TRIM.
3. If you are going to do further enrichment to the data in Oracle later, then you could just write the values into Oracle as varchar or char and cast or convert them later.
I hope this helps.
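For option 1, the regex can be scoped to a single field so that other fields keep their spaces. ReplaceText lets you use regex back-references inside the Expression Language, so a minimal sketch (the field name "amount" is just a placeholder) would be:

    ReplaceText
      Replacement Strategy: Regex Replace
      Search Value: "amount"\s*:\s*"([^"]*)"
      Replacement Value: "amount": "${'$1':replace(' ', '')}"

This matches only the "amount" entry and strips the spaces from its value, leaving every other field untouched.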
@@StevenKoon Hello Steven,
First of all, thank you very much!! The ReplaceText idea does provide an answer, but it currently removes spaces from all fields, and I do not know how to get NiFi to remove them only from the relevant fields.
To clarify the setup: I used JoltTransform as you taught in one of your tutorials, and now I have a JSON object that contains a lot of fields, with a value for each field. There are specific fields where I want to remove the spaces from the values, while removing the spaces from other fields' values would also prevent me from loading the data into the database. Do you know how to make ReplaceText work on only those specific fields?
Thank you so much Steven!! You're the best!!!
Can you list, in order, the NiFi processors that you're using? I would like to understand which processor the source data starts with.
@@StevenKoon sure!
1) ConsumeJMS
2) ConvertCharacterSet
3) RouteOnContent
4) RemoveNewLineCharacter (both content types)
5) ExtractGrok (both content types)
6) JoltTransform (one content type has 1 JoltTransform processor and the other has 3)
7) ReplaceText (the 2 content types converge from now on)
8) ReplaceText - This is the one I'm stuck on
9) ConvertJSONToSQL
10) PutSQL
thank you!
I'm assuming that these numeric fields are showing up in the JSON as strings. You could run the flowfiles into a QueryRecord and use CAST(TRIM(fieldname) AS INTEGER). That may get you the result you want.
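A minimal sketch of that QueryRecord setup (the field name and the dynamic property name are placeholders; the dynamic property name becomes an outbound relationship):

    QueryRecord
      Record Reader: JsonTreeReader
      Record Writer: JsonRecordSetWriter
      trimmed (dynamic property):
          SELECT CAST(TRIM(fieldname) AS INTEGER) AS fieldname
          FROM FLOWFILE

If the values can contain interior spaces as well, REPLACE(fieldname, ' ', '') instead of TRIM should handle those.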
Hi Steven,
I am new to NiFi and am trying to work on a use case where, for every insert into a table, I have to read the complete row and update a value in another table based on one of the column values of Table 1. How do I do this? Should I write change feeds to a Kafka topic and consume from that topic? Also, how do I update Table 2 for every row in the topic? And how can the Table 2 SQL have a dynamic SET clause?
Hello GPS, for the first part you could use either "RouteOnAttribute" or "RouteOnContent". The first option will require you to turn the table value into an attribute first. Also, when you talk about a dynamic SET clause, what part do you want to be dynamic? Do you need to change the table you are writing to based on the content of the flowfile?
@@StevenKoon: Thanks for your response Steven. My goal is to read from Table 1 and, for each entry in Table 1, update Table 2. Both Table 1 and Table 2 will have a key in common. For example, Table 1 and Table 2 have a book ID in common, so if there is an entry in Table 1 today for BookId 1, then in Table 2 I update the Number of Subscribers column with the value from Table 1. The dynamic part is this: if I were to use an InvokeHTTP processor or an SQL update processor, how do I pass the value of each row to the web service and update Table 2 dynamically?
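One common pattern for the SQL side (a sketch, not the only way; all names are placeholders): split the Table 1 results into one row per flowfile, pull the key columns into attributes, then build the statement for PutSQL.

    SplitJson (one Table 1 row per flowfile)
    EvaluateJsonPath (Destination: flowfile-attribute)
      book.id: $.BOOK_ID
      subs: $.SUBSCRIBERS
    ReplaceText (Replacement Strategy: Always Replace)
      Replacement Value: UPDATE Table2 SET number_of_subscribers = ${subs} WHERE book_id = ${book.id}
    PutSQL

For a production flow, PutDatabaseRecord with Statement Type UPDATE and Update Keys set to the shared key column does the same thing without hand-building SQL.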
Can you please give us this template?
Hello, this was before I started to put my templates in my GitHub. Is there a specific part of the flow that you're interested in? If so, I could do a new flow to demonstrate what you're looking for and also make the template available.
@@StevenKoon yes I am interested, please