Forum Discussion

yu's avatar
Occasional Contributor
6 years ago

How to extract an HTML string from a JSON response

Hi all! Would you be so kind as to help with this?

The response from the server is in JSON format:

"Success": true,
"Message": "OK",
"Result": "\r\n<div class=\"row contact table-row\" data-id=\"551215\" data-disabled=\"False\" data-keydisabled=\"False\">\r\n <input id=\"returnCompanyId\" name=\"returnCompanyId\" type=\"hidden\" ... 
"Title": ""

I used a Property Transfer and now have this string in a TestCase property:

 <div class="row contact table-row" data-id="551215" data-disabled="False" data-keydisabled="False"> <input id="returnCompanyId" name="returnCompanyId" type="hidden" value="374982" /> <div class="icon-cell col-md-1-half table-cell">  ....


 Is it possible to parse the dynamic value of data-id with a Groovy script?


 Can anybody help me parse this value? Thank you!



  • OK - it took me a while, but I've got it working using XPath (like the CDATA handling, passing the content from one property to another) - but there is a problem with the content.


    You have a value of &nbsp; in your content that the parser doesn't like. Once I'd removed this, I got it working with the embedded functionality, and I used an event handler to replace the '&nbsp;' entity anyway.


    Just so you can follow what I did - I created a VIRT that would return your response




    I ran the request and did a property transfer of the Result object, passing the contents of the Result object (html string) to a property (see next image)



    I then did a transfer of the html string (in the Properties step) to a second Property, but specifying the data-id tag attribute (see next image)


    as you can see - the data-id attribute value was successfully picked up in the transfer (THERE IS A PROBLEM HERE - and I'll come back to it) - for completeness the next image is the Properties step - here you can see the 2 properties that were extracted successfully



    So - all looks OK - right?  Well, there is a problem using XPath to extract the HTML attribute value you need.  There is an undeclared XML entity (not one of the five predefined ones: &amp;, &lt;, &gt;, &quot; and &apos;) within your content - the value is '&nbsp;' within the HTML string, and it will cause a problem for the parser.


    If you search your response you can find it - the above approach will fail BECAUSE the content includes this.  I cheated inasmuch as I initially replaced the '&nbsp;' entity with '&amp;' by hand once the string had been transferred (after the first transfer).  There is a way around this (by using an event handler to replace this '&nbsp;' with ANYTHING that is standard XML) - however, it depends how important the content is in your tests before you start thinking about altering the response using an event handler.


    SO - if you don't replace the '&nbsp;', the testcase will fail on the property transfer step.  If you replace the '&nbsp;' with something that doesn't need declaring (like '&amp;') the test case will run through executing successfully (extracting the values into the Properties step for later use)
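To make the failure concrete, here is a minimal Groovy sketch of the same effect outside the tool. It assumes the standard Java XML parser (JAXP), which may not be the parser the Property Transfer uses internally; the sample string and the helper name `evalDataId` are made up for illustration.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathFactory
import org.xml.sax.InputSource
import org.xml.sax.SAXParseException
import org.xml.sax.helpers.DefaultHandler

// Illustrative HTML fragment containing the undeclared &nbsp; entity
def html = '<div class="row" data-id="551215"><span>a&nbsp;b</span></div>'

// Hypothetical helper: parse the string as XML and evaluate the XPath
def evalDataId = { String xml ->
    def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    // DefaultHandler rethrows fatal errors without printing them to stderr
    builder.setErrorHandler(new DefaultHandler())
    def doc = builder.parse(new InputSource(new StringReader(xml)))
    XPathFactory.newInstance().newXPath().evaluate('//div/@data-id', doc)
}

// With &nbsp; still in place the parser rejects the document
def failed = false
try {
    evalDataId(html)
} catch (SAXParseException e) {
    failed = true
}
assert failed

// After swapping in a predefined entity the XPath resolves
def cleaned = html.replace('&nbsp;', '&amp;')
assert evalDataId(cleaned) == '551215'
```

Note that swapping in '&amp;' changes the text content (the space becomes an ampersand), which is why it only suits tests that do not assert on that text.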


    As I state above - to get it to run successfully without any manual intervention - I used the 'RequestFilter.afterRequest' event handler.  The code for the event handler was as follows:


    // RequestFilter.afterRequest event handler: rewrite the response content
    def content = context.httpResponse.responseContent
    // Replace the non-breaking-space entity (undeclared in XML) with the predefined ampersand entity
    content = content.replaceAll( "&nbsp;", "&amp;" )
    context.httpResponse.responseContent = content

    So - this works completely automatically once I'd added in that event handler.  It's not pretty, and some of the other lads may have some whizzy way of using a Groovy step in the test case to do all the work for you (but I can't help with that) - but if there's no problem replacing the non-breaking-space entity, the above will get you where you need to be (i.e. extracting attributes from the HTML string)
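For anyone who wants the Groovy-step route mentioned above, the following is a rough sketch rather than a tested solution for this project. It parses the JSON with `groovy.json.JsonSlurper` and pulls the attribute out with a regular expression, so the '&nbsp;' entity never reaches an XML parser. The sample payload is a shortened, made-up version of the response in the question, and the property name `dataId` in the comments is hypothetical.

```groovy
import groovy.json.JsonSlurper

// Shortened, illustrative copy of the response from the question.
// In a Groovy test step the body would come from the previous request step instead.
def responseBody = '''{
  "Success": true,
  "Message": "OK",
  "Result": "\\r\\n<div class=\\"row contact table-row\\" data-id=\\"551215\\" data-disabled=\\"False\\">&nbsp;</div>",
  "Title": ""
}'''

// Parsing the JSON unescapes the embedded HTML string
def html = new JsonSlurper().parseText(responseBody).Result

// Capture the digits of the first data-id attribute; null if none is found
def matcher = html =~ /data-id="(\d+)"/
def dataId = matcher.find() ? matcher.group(1) : null

assert dataId == '551215'

// Inside the tool, the value could then be stored for the next POST, e.g.
// testRunner.testCase.setPropertyValue("dataId", dataId)   // "dataId" is a hypothetical property name
```

The regular expression assumes the id is numeric and that the first data-id in the string is the one you want; if the markup has several, an XML or HTML parser with a more specific path is the safer choice.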


    Hope this helps!











  • yu's avatar
    6 years ago

    Hi Richie, 

    Thank you very much for your help!

    I used your solution and it works! It's wonderful!

    I don't know what the 'RequestFilter.afterRequest' event handler is,

    so I implemented the solution in a longer way, but with the same logic as you described.

    It might be helpful for somebody:

    1) Save the JSON response in Property1 using a Property Transfer.

    2) Then, in a Groovy script, read this property and replace '&nbsp;' with '&amp;' as you described,

    and store the resulting XML in another property.

     The code is:

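    // Read the string saved by the Property Transfer (suite-level property "TempTarget")
    def result = testRunner.testCase.testSuite.getPropertyValue("TempTarget")
    // Replace the undeclared &nbsp; entity so the string can be parsed as XML
    result = result.replaceAll( "&nbsp;", "&amp;")
    // Store the cleaned string in a second suite-level property
    testRunner.testCase.testSuite.setPropertyValue("Extracted", result)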


    3) Use a Property Transfer to take the value of //div//@data-id from the saved XML.


    Best regards!

9 Replies

  • richie's avatar
    Community Hero



    I'm sorry - I'm kind of lost by your description - could you provide a little more info please?


    What I got from your post is that you have a JSON response and the 'Result' attribute value is actually an escaped HTML string.  That value contains a data-id attribute within the div element with a value of '551215'.  You're using a property transfer to transfer the value of '551215' to a testcase level property - is that right?


    Are you saying that when you try the property transfer - the value being written to the testcase property is not '551215' but '55125'? is that right?  cos that value isn't even being truncated - it's a totally different value.  Or are you saying you already have a value written to the testcase property?


    As I say - I'm a little lost from your description - someone cleverer than me might understand - but I'd need a bit more info to get exactly where you're coming from





    • yu's avatar
      Occasional Contributor


       How is it possible to parse only the value of the data-id attribute, and write this value to a property?

      The value of the data-id attribute is different for every response.

    • richie's avatar
      Community Hero

      Hi yu


      I can see that you responded to my post with a response laying out the problem - but for some reason - I cannot see your response when viewing the web page itself - I can only see your response in my hotmail account?!?!? (did you delete your post after submitting?)


      Anyway - you explained - you are trying to extract the data-id attribute value (that is dynamic) from the response - the trouble being the 'data-id' attribute is actually within an escaped HTML string within your JSON.  This is a similar issue to one I've had - extracting CDATA values from within XML.  There's an OOB option for handling CDATA where you pass a property to another property to extract the data you need - I tried tailoring that option - but I'd forgotten that I need to use the Saxon parser to do this.


      I need to use the Saxon parser, but with the JSON option - I'm looking at this now to see if I can get it to work


      The other option is groovy - I tried putting the required groovy together quickly based on the help other people have given me before now (e.g. @msiadak,@Lucian etc.) but I was getting compilation errors - so I think I'm gonna leave the groovy to the experts!


      It might take me til tomorrow before I get the Saxon parser working for the JSON - so for now - the Groovy is probably quicker for someone else to knock out





      • yu's avatar
        Occasional Contributor


         Is it possible to parse a dynamic value from the data-id attribute with a Groovy script?

        The value of the data-id attribute is different for every response.  I need to use it in a property for another POST.


        Kind regards,