The line separator in Analytics outputs is normally a semicolon, but should not be ambiguous
In various exports from Analytics, we noticed that certain characters present in Alma are output as follows:
- angle bracket < in Alma is output by Analytics as &lt;
- angle bracket > in Alma (bibliographic data) is output by Analytics as &gt;
- ampersand & is output as &amp;
This is likely because Analytics uses HTML as the export format. However, for further processing of large amounts of data, it is disruptive that this encoding uses a semicolon. The semicolon appears within individual data cells in Alma (e.g. in BIB data fields), either as part of the data content or as a formal descriptive character, and it also serves as the line separator in results from Analytics (whenever a BIB category is repeated).
It would be beneficial if the ambiguity of the semicolon in Analytics were resolved by representing the line separator with a character that appears neither in running text nor in the HTML encoding of angle brackets or the ampersand.
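Until such a separator exists, the ambiguity can be partly worked around after export. The following is only a minimal sketch, not an Analytics feature: the function name `split_repeated_field` is hypothetical, and it assumes that only the three entities listed above occur and that repeated values are joined by a bare semicolon. It splits on semicolons that do not terminate one of those entities and then decodes each part.

```python
import html
import re

# A semicolon that is NOT the end of &lt; &gt; or &amp;
# Assumption: only these three entities occur in the export.
_SEPARATOR = re.compile(r"(?<!&lt)(?<!&gt)(?<!&amp);")


def split_repeated_field(cell: str) -> list[str]:
    """Split an exported cell into its repeated values and decode entities."""
    parts = _SEPARATOR.split(cell)
    # Decode only after splitting, so entity semicolons are never seen as separators.
    return [html.unescape(part.strip()) for part in parts]


cell = "Smith &amp; Sons;A &lt;B&gt; C"
print(cell.split(";"))             # naive split: 5 fragments
print(split_repeated_field(cell))  # ['Smith & Sons', 'A <B> C']
```

This does not help with semicolons that are genuine data content in Alma; those remain indistinguishable from the separator, which is why a dedicated separator character is still needed.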
We were referred to the Idea Exchange.
-
Ute Ristau
commented
In my opinion, the fundamental problem is that the data is not being transferred correctly from Alma to Analytics. Characters that are correct in Alma are suddenly HTML-encoded in Analytics. Everything else is a consequence of this problem.
-
Johanna Looft
commented
To avoid misunderstandings: a specific line separator (whenever a MARC field is repeated within a bibliographic description) is necessary. What is misleading for further processing is the ambiguous use of a character in data exports.
-
Andreas Weber (USB Köln)
commented
The usability of exports must not be compromised by missing or incorrect character encoding. This issue needs to be resolved urgently. Especially in automated processing, no errors should occur, as the data is typically not reviewed manually beforehand.
-
Ute Ristau
commented
My experience with this is:
In Alma, correctly displayed characters are transferred to Analytics as HTML-encoded characters, e.g.
in Alma: < = in Analytics: &lt;
in Alma: > = in Analytics: &gt;
in Alma: & = in Analytics: &amp;
As a result, it is not possible to filter the data correctly.
Furthermore, it is not possible to filter by these characters (e.g. <). Repeatable Alma fields that are to be split again in Analytics cannot be split correctly. Too many columns are created.
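For data that has already been exported, filtering by the literal character becomes possible once the entities are decoded. A minimal sketch, assuming the exported values are available as plain strings; the helper name `rows_containing` is hypothetical and this is a post-processing workaround outside Analytics, not a fix for the filter inside Analytics:

```python
import html


def rows_containing(values, needle):
    """Return the decoded values that contain the literal character(s) in needle."""
    decoded = (html.unescape(value) for value in values)
    return [value for value in decoded if needle in value]


exported = ["Title &lt;draft&gt;", "Plain title", "A &amp; B"]
print(rows_containing(exported, "<"))  # ['Title <draft>']
```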