SAP Data Services – Importing Image Data

 

Quite recently I was working on an SAP Data Services project which involved importing a great deal of data about fruits including photographs of the different varieties. There was a large amount of work to do and I must admit that whenever I thought about this photographic element of the import I developed a kind of coders twitch; I would hit google, notice the total lack of suggestion as to how this can be achieved and I would then push it very deliberately to the bottom of my pile. This is not normally how I work, I generally tend to do the hardest things first make stunning progress and then realise how much I had underestimated the complexities of the ‘simpler tasks’! But this was image data which is probably my least favourite thing, I like data… Eventually there came a point when I had no pile left under which to push this job any longer and so I had to assert myself to the job in hand.

Lets start by stating the problem, which lets face it, could have been much worse. We were given a spreadsheet containing many different varieties of fruits along with an awful lot of data, most of which is to be frank, completely irrelevant to this blog. This document also contained a relative path for each variety of fruit which described the location of a jpg file which was the picture itself. My task was to get all of this data into a database to allow the reporting services  to report against. So lets take a simplistic look at the data:-

As you can see we have a Variety Column which describes the name of the fruit Variety  along with a Product column which describes the Type of fruit. Row 1 is thus a variety of ‘Mango’ called ‘Amelie’ whilst Row 2 is a variety of ‘Passion Fruit’ called ‘Edulis’. For each of these varieties we also have a ‘Photo’ column which in this instance stores a relative path (from the context of the location of the document itself). The base path of this document is in this instance:-

K:\ImportFiles\Shared\Photos\

Helpfully the first  row we look at in our earlier screen capture has a null value but the next row contains a valid relative path of

Exotic photos\Edulis.JPG

Using this relative path and the known location of the document itself we can thus build a path that details the location of the image relating to the Edulis Passion Fruit:-

K:\ImportFiles\Shared\Photos\Exotic photos\Edulis.JPG

Nice and simple, so we have the path now how do we go about translating that path into BLOB data  representing the actual image itself? This is where I struggled to find information, I scanned the function library looking for a ‘LoadImage’ method or similar before hitting the internet itself, after turning over some pretty scabby stones I found a forum post that was pretty illegible though it did contain a morsel of gold dust. Turns out there is no specific load function within the SAP library which I had already ascertained. There was however a fairly simple, but equally obtuse method for loading image data which involves writing the path name to a Text File  using a certain format and then opening and importing that data using a standard workflow. Lets investigate just how that works. We start by processing our spreadsheet data into a format that can be written to our text file, the transformation  I wrote looks like this:-

As you can see when you examine this first query I am merely removing any leading/trailing blanks and ensuring that null values are replaced with an empty string. Nothing tricky there nor even necessarily relevant to you.

The next operation is where the real work is done.

For the purposes of this demonstration we are only really interested in the ‘Photo’ field where as you can see we are building a string that looks a lot like this:-

<<K:\ImportFiles\Shared\Photos\Exotic photos\Edulis.JPG>>

The appending of the angle brackets will later tell SAP data services to treat the data contained within them as the path for an image that should be loaded. You needn’t worry yourself with the replace function that I have had to use, this is peculiar to my data alone. In reality your code will look more like this:-

'<<K:\ImportFiles\Shared\Photos\\' ||  Query.PHOTO || '>>'

This data can then be written out to a staging file, my file is formatted like this, note how the Photo field we are really interested in has been defined as a blob.

We then need to import the data, we thus create a new data flow looking much like this one using our new staging file type as the source.

The query component with regard to the Photo field itself simply maps the data together, in my instance I wanted to remove some blanks in another field which is why the query is even necessary and the data is then simply written into a blob field in the target database.

And that my friends, is mission accomplished…. This blog will self destruct in 10 seconds.