Sorting output by date

nzblue_fish

Hi mashers,

back again ...

I need to sort RSS items by their published date before outputting them via a mashup. The dates I start with are in dd/mm/yyyy format, which I can't simply sort as strings, so I need to reformat them as yyyymmdd and then sort. However, since the date is part of the RSS item, it is contained in <pubDate/> and needs to stay in the format I already have.
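
To illustrate the problem outside of EMML, here is a minimal Java sketch of why a plain string sort on dd/mm/yyyy goes wrong and why a yyyymmdd key fixes it. The class and method names (DateStringSort, toSortKey, sortByKey) and the sample dates are made up for illustration; this is not Presto code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DateStringSort {
    // Rearranges "dd/mm/yyyy" into "yyyymmdd" so that string order matches date order.
    // Assumes a well-formed, zero-padded input (illustrative helper, not a Presto function).
    static String toSortKey(String ddmmyyyy) {
        String[] parts = ddmmyyyy.split("/");
        return parts[2] + parts[1] + parts[0];
    }

    // Sorts on the derived key but returns the dates in their original format.
    static List<String> sortByKey(List<String> dates) {
        List<String> sorted = new ArrayList<>(dates);
        sorted.sort(Comparator.comparing(DateStringSort::toSortKey));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> dates = List.of("18/03/2010", "08/01/2011", "17/02/2010");

        // Naive string sort compares the day digits first, so January 2011 comes out first.
        List<String> naive = new ArrayList<>(dates);
        naive.sort(Comparator.naturalOrder());
        System.out.println(naive);            // [08/01/2011, 17/02/2010, 18/03/2010]

        // Sorting on the yyyymmdd key gives true chronological order.
        System.out.println(sortByKey(dates)); // [17/02/2010, 18/03/2010, 08/01/2011]
    }
}
```

The original strings are left untouched and only the sort key is reformatted, which is the same requirement as keeping <pubDate/> in its existing format.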

My first thought was to add a temporary field, in the EMML I use for the input mashup, holding the date in yyyymmdd format, then sort on that, and then extract all the RSS nodes except the temporary date. That sounds feasible, but it involves a bit of effort.

Then I thought: since the sort in Wires clearly uses XPath(?), could I sort on the actual <pubDate/> value but reformat it as part of the sort syntax? For example, could I do something like /*:rss/*:channel/*:item/reformatmydate(*:pubDate) in the sort block's "sort on" statement? Could it be as easy as that, and which XPath function(s) would I need to apply?

As always, thanks in advance.

Innes (NZ)

aishmishra

Hi Innes,

There is a section in the Presto Developer Documentation that explains sorting on number or date types.

Here is the link to that part of the documentation:

http://www.jackbe.com/prestodocs/v2.7.0/prestolibrary/wwhelp/wwhimpl/api...

Do let us know how it goes.

smitchell

Your analysis is basically correct. However, XPath doesn't have a built-in function that takes date strings in other formats and transforms them to the ISO yyyy-mm-dd format needed to cast the string to a date.

There is an undocumented, custom XPath function in Presto named ISODateFormatExt (which will soon be documented! thanks for reminding me :) that can convert many common date/time formats to the ISO date/time format that the XPath xs:date and xs:dateTime functions require in order to cast a string to a date or dateTime. You can use this, combined with the appropriate casting function, to get pubDate into a form that <sort> will understand.
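
For anyone who wants to see the convert-then-cast idea outside of Presto, here is a rough Java sketch. It is not the Presto implementation: reformatDate is a hypothetical stand-in that only mirrors the three arguments described here (value, input format, output format), and LocalDate.parse plays the role of the xs:date cast.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class ConvertThenSort {
    // Hypothetical stand-in for the conversion step: re-renders a date string
    // from one SimpleDateFormat pattern to another. Not the Presto function itself.
    static String reformatDate(String value, String inFormat, String outFormat) {
        try {
            SimpleDateFormat in = new SimpleDateFormat(inFormat, Locale.ENGLISH);
            in.setLenient(false); // reject impossible dates such as 31/02/2010
            SimpleDateFormat out = new SimpleDateFormat(outFormat, Locale.ENGLISH);
            return out.format(in.parse(value));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse date: " + value, e);
        }
    }

    // Sorts the original strings by the date they represent.
    // LocalDate.parse accepts ISO yyyy-MM-dd, much as the xs:date cast does.
    static List<String> sortByDate(List<String> values, String inFormat) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparing(
                (String v) -> LocalDate.parse(reformatDate(v, inFormat, "yyyy-MM-dd"))));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> pubDates = List.of("18/03/2010", "08/01/2011", "17/02/2010");
        System.out.println(sortByDate(pubDates, "dd/MM/yyyy"));
    }
}
```

The sort key is computed on the fly and thrown away, so the items keep their original date strings.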

So the basics of a mashup that does this would look something like this: 

<mashup
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.jackbe.com/2008-03-01/EMMLSchema/ ../src/schemas/EMMLSpec.xsd"
xmlns="http://www.jackbe.com/2008-03-01/EMMLSchema"
xmlns:fn="java:com.jackbe.jbp.jems.clientEMMLPrestoFunctions"
name="mymashup">
<!-- declare date formats for conversion function -->
<variable name="pubdateFormat" type="string" default="dd/MM/yyyy"/>
<variable name="isoFormat" type="string" default="yyyy-MM-dd"/>
<!-- invoke rss service and put output in variable myrss -->

<sort inputvariable="$myrss" sortexpr="/rss/channel/item"
sortkeys="xs:date(fn:ISODateFormatExt(pubDate, $pubdateFormat, $isoFormat))"
outputvariable="result" />
...
</mashup>

Sara, technical writer/jackbe

nzblue_fish

Hi guys,

Thanks for the replies. I'm trying a couple of options just for the fun of it, but Sara, the ISODateFormatExt function errors during execution. Here's the error message from the debug console:

<code>

Error on line 21 of *module with no systemId*:
  XPST0017: Cannot find a matching 3-argument function named
  {java:com.jackbe.jbp.jems.clientEMMLPrestoFunctions}ISODateFormatExt()

</code>

Cheers, Innes (NZ)

raj

Hi Innes,

Actually, the fn namespace should be declared as

xmlns:fn="java:com.jackbe.jbp.jems.client.EMMLPrestoFunctions"

Note the '.' between client and EMMLPrestoFunctions.

Let us know if you still have issues.

raj.  chief masher @ jackbe

nzblue_fish

Hi Raj,

Thanks, that worked great.

I went around in circles for ages getting some very nasty errors when I ran the mashup. I couldn't figure out what was wrong until I realised that my intermediate XML document had a rather odd structure:

<?xml version="1.0" encoding="UTF-8"?>

<xml>

    ... all my child nodes ...

</xml>

and that the way I was crafting the sortexpr resulted in a null pointer error when the sort executed. I ended up with the following sort statement that didn't error:

        <sort inputvariable="$RSSItems" sortexpr="/xml/child::*"
              sortkeys="xs:date(fn:ISODateFormatExt(pubDate,$pubDateFormat,$isoFormat)) ascending" outputvariable="$RSSItems" />

Appreciate all the help from you guys ... once again.

Cheers, Innes (NZ)

WebTechMan

I was wondering if this approach would also fix my date sorting issues.

<variable name="pubdateFormat" type="string" default="DD, dd MM yyyy hh:mm:ssZ"/>
<variable name="isoFormat" type="string" default="yyyy-MM-dd"/>
<sort name="sort_6" inputvariable="merge_5_out" outputvariable="output_0" sortexpr="/*:rss/*:channel/*:item" sortkeys="xs:dateTime(fn:ISODateFormatExt( *:pubDate, $pubdateFormat, $isoFormat ) ) descending" />
 

My pubDate looks like this: "Thu, 18 Mar 2010 23:42:31 GMT+00:00"

How do I sort this by date?

Thanks,

Dan

Daniel Hudson
@WebTechMan

smitchell

I believe (but haven't actually tried this) that you should be able to use the same XPath function to handle this. What you need is the correct format string for your input date.

The definitive source for this is the JDK SimpleDateFormat class, which you can find in the Java API docs on the web. I'm looking at the JDK 1.5 docs; if you're using 1.4.2, check the documentation for that version. For 1.5, use:

EEE for the "Thu" part of the date.

dd MMM yyyy for the date itself

HH:mm:ss for the time, with one caveat. Use HH for the hour if this is a 24-hour clock from 0-23, but use kk if it is 1-24

z for the GMT time zone. 
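
Put together, those pieces can be tried in plain Java before wiring them into a mashup. This is only a sketch: the class and method names (PubDateToIso, toIso) are made up, the output time zone is pinned to GMT as an assumption, and it exercises the JDK's SimpleDateFormat directly rather than the Presto function. One thing to watch: a cast to xs:dateTime needs a full ISO date-time string, so the output pattern would have to include the time part, while yyyy-MM-dd is enough for xs:date.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class PubDateToIso {
    // Input pattern assembled from the pieces listed above.
    static final String PUBDATE_PATTERN = "EEE, dd MMM yyyy HH:mm:ss z";

    // Parses an RSS-style pubDate and re-renders it with the given ISO-style pattern.
    static String toIso(String pubDate, String isoPattern) {
        // Locale.ENGLISH so "Thu" and "Mar" parse regardless of the machine's locale.
        SimpleDateFormat in = new SimpleDateFormat(PUBDATE_PATTERN, Locale.ENGLISH);
        SimpleDateFormat out = new SimpleDateFormat(isoPattern, Locale.ENGLISH);
        // Assumption: render in GMT so the result does not depend on the local time zone.
        out.setTimeZone(TimeZone.getTimeZone("GMT"));
        try {
            return out.format(in.parse(pubDate));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse pubDate: " + pubDate, e);
        }
    }

    public static void main(String[] args) {
        String sample = "Thu, 18 Mar 2010 23:42:31 GMT+00:00";
        // Date-only form, suitable for an xs:date cast.
        System.out.println(toIso(sample, "yyyy-MM-dd"));
        // Full date-time form, which an xs:dateTime cast would need.
        System.out.println(toIso(sample, "yyyy-MM-dd'T'HH:mm:ss"));
    }
}
```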

 

Sara, technical writer/jackbe

nzblue_fish

Hi Dan,

I recently had to create an RSS feed that needed a date in the format you are trying to use. I've used the date conversion function in quite a few EMML scripts, but I had to work out the correct syntax for that date declaration first. I have found the SimpleDateFormat reference page (linked in the comments of the script below) to be the best yet and keep it permanently as a favorite.

I've created the following EMML script for you that shows how to do the sort based on this date format. I've tested it and know it works. Hope this helps with whatever you are working on.

Cheers, Innes

<mashup xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.jackbe.com/2008-03-01/EMMLSchema/ ../src/schemas/EMMLSpec.xsd"
    xmlns="http://www.jackbe.com/2008-03-01/EMMLSchema" xmlns:macro="http://www.jackbe.com/2008-03-01/EMMLMacro"
    xmlns:fn="java:com.jackbe.jbp.jems.client.EMMLPrestoFunctions" name="test16">
    <!--
        Note the namespace declaration for the EMMLPrestoFunctions above which
        is needed to call the ISODateFormatExt function in the sort.
    -->
    <operation name="runTest16">

        <output name="sortedTestDoc" type="document" />

        <variables>
            <!--
                check out the following webpage for formatting of the dates:
                http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html
            -->
            <variable name="RFC822DateFormat" type="string"
                default="EEE, d MMM yyyy HH:mm:ss z" />
            <variable name="isoFormat" type="string" default="yyyy-MM-dd" />
            
            <!-- here's our test document to be sorted -->
            <variable name="testSortDoc" type="document">
                <sortItems>
                    <item>
                        <position>1</position>
                        <sortdate>Thu, 18 Mar 2010 23:42:31 GMT+00:00</sortdate>
                    </item>
                    <item>
                        <position>2</position>
                        <sortdate>Fri, 08 Jan 2010 12:01:30 GMT+00:00</sortdate>
                    </item>
                    <item>
                        <position>3</position>
                        <sortdate>Wed, 17 Feb 2010 09:30:00 GMT+00:00</sortdate>
                    </item>
                </sortItems>
            </variable>
        </variables>
        
        <!-- Sort the test document based on the date which is in the RFC822 date form
             which may occur in RSS feeds
        -->
        <sort inputvariable="testSortDoc" sortexpr="/sortItems/child::*"
            sortkeys="xs:date(fn:ISODateFormatExt(sortdate,$RFC822DateFormat,$isoFormat)) ascending"
            outputvariable="sortedTestDoc" />

    </operation>
</mashup>