xmp: error no XMP Tag found even though multiple XMP tags are found #45

Open
mholt opened this issue Jan 8, 2023 · 7 comments

Comments

@mholt
Collaborator

mholt commented Jan 8, 2023

I'm using the example from the readme on a JPEG file:

m, err := imagemeta.Parse(f)
if err != nil {
    return nil
}
fmt.Println(m.Xmp())

but I get an error of xmp: error no XMP Tag found (i.e. EOF), even though this file has two XMP sections, with these offsets from 0:

$ grep --binary --text --byte-offset --only-matching '<x:xmpmeta' PXL_20230104_032015182.MP.jpg 
8557:<x:xmpmeta
10480:<x:xmpmeta

The first one is for the JPEG, the second one I believe is for an embedded video (motion picture).
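For readers without grep at hand, the same offset check can be sketched in Go. This is a minimal illustration using only the standard library, not part of imagemeta; the function name findOffsets is made up for this example. It prints offsets from 0 in the same offset:token form as the grep output above.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// xmpStart is the token searched for by the grep command in the report.
var xmpStart = []byte("<x:xmpmeta")

// findOffsets returns the byte offset (from 0) of every occurrence of
// token in data. It is a hypothetical helper, not an imagemeta function.
func findOffsets(data, token []byte) []int {
	var offsets []int
	for pos := 0; ; {
		i := bytes.Index(data[pos:], token)
		if i < 0 {
			return offsets
		}
		offsets = append(offsets, pos+i)
		pos += i + len(token) // continue searching after this match
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: xmpoffsets <file>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	for _, off := range findOffsets(data, xmpStart) {
		fmt.Printf("%d:%s\n", off, xmpStart)
	}
}
```

Comparing these offsets with the offset stored in the parsed header is a quick way to tell whether the header value is relative to the start of the file or to some inner section.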

When I add some fmt.Printf() lines, I see that the XmpHeader after parsing is showing an offset of 75971, which... is either clearly wrong, or is not offset from 0.

I've cloned the repo and am trying to get a grasp for how the readers work, but there's lots of sectioning and ReadAt() and it's a little hard for me to follow.

Any ideas or troubleshooting tips?

Thanks for this package!

@mholt
Collaborator Author

mholt commented Jan 8, 2023

Additionally, I have a .heif file that has an XMP section:

$ grep --binary --text --byte-offset '<x:xmpmeta' 20220423_085935.heif 
974650:debuginfo<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.1.0-jc003">

But the same code yields an error: xmp: error no XMP Tag found

@evanoberholster
Owner

@mholt, any chance you could share those images with me? That would assist in debugging.

This library is definitely not perfect and was written with performance as the goal, so there are likely cases where it needs to be tweaked.

@mholt
Collaborator Author

mholt commented Jan 8, 2023

@evanoberholster Absolutely! Sorry, that would have been the obvious helpful thing to do :)

Here's one: https://drive.google.com/file/d/1mmDRBPfxDenvCQz0wezygyGhcWdoNJHk/view?usp=sharing

One thing I have noted in my debugging is that there are XMP fields that are longer than 1.5 KB; notably, I think the MakerNotes on this picture are about 70-80 KB. I've seen some up to 100 KB.

When I raise the maximum value and buffer sizes, I am also seeing that Peek() in the readAttributeValue loop somehow stops returning more than about 60 KB at a time, so the closing delimiter is never found. This picture should have makernotes going from 9830 to 78430, but the read buffer will never be large enough to see the closing quotes at 78430.

One thing I also tried in my debugging is to change the signature of Xmp() from returning xmp.XMP to []xmp.XMP, since a file may have multiple XMP documents. (In this file's case, I suspect the second one is for the embedded mp4 file, i.e. motion photo. But my application can know to use only the first or the second depending on what it's doing.)
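As an illustration of the "multiple documents" idea, here is a minimal sketch that returns every XMP document found in a byte slice rather than only the first. It is not the library's API: xmpDocuments is a made-up name, and the approach assumes each document sits contiguously between <x:xmpmeta and </x:xmpmeta>, which does not hold when a document is split across JPEG segments.

```go
package main

import (
	"bytes"
	"fmt"
)

var (
	openTag  = []byte("<x:xmpmeta")
	closeTag = []byte("</x:xmpmeta>")
)

// xmpDocuments returns each <x:xmpmeta ...>...</x:xmpmeta> span in data,
// in file order. Hypothetical helper for illustration only.
func xmpDocuments(data []byte) [][]byte {
	var docs [][]byte
	for {
		start := bytes.Index(data, openTag)
		if start < 0 {
			return docs
		}
		end := bytes.Index(data[start:], closeTag)
		if end < 0 {
			return docs // unterminated document: stop rather than guess
		}
		end += start + len(closeTag)
		docs = append(docs, data[start:end])
		data = data[end:] // keep scanning after this document
	}
}

func main() {
	data := []byte(`..<x:xmpmeta a="1"></x:xmpmeta>..<x:xmpmeta b="2"></x:xmpmeta>..`)
	for i, doc := range xmpDocuments(data) {
		fmt.Printf("document %d: %s\n", i, doc)
	}
}
```

Returning a slice leaves the choice of document to the caller, which matches the use case described above.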

Thank you so much for the reply! I had been trying a few of the other libraries your readme mentions before I found this one, and this one looks the most maintained, so I really appreciate that.

Edit: Oh yeah, and here's a link to the comment with a HEIC image. I haven't had a chance to start debugging it yet.

One thing I appreciate about your lib is that it doesn't rely on <?xpacket as a start token; as I've found that some photo files don't have it. But I think they all use <x:xmpmeta at least.

@evanoberholster
Owner

@mholt. Thanks for sending me that jpg file. It was eye-opening. I had not come across a file layout like the one in this image. When I am debugging a file structure I normally use ./exiftool -htmldump image.jpg > image.jpg.html, as it gives a good idea of the structure.

The JPEG that you posted appears to have APP0 JFIF tags, which I hadn't dealt with in my library before. See c89196d.

Most of the XMP files that I work with come from Lightroom, and therefore I have used a smaller buffer (1.5 KB). Thank you for the suggestion; I will look for more XMP files to test with.

This Go library is definitely undergoing some refactoring and rewriting, which you will see under the develop branch. Thank you for the suggestion of returning an []xmp.XMP instead of an xmp.XMP.
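To see where an APP0 JFIF segment sits relative to the APP1 segments, the marker segments before the scan data can be listed with a small walker. This is a simplified sketch, not the library's reader: listSegments and the segment type are invented for this example, and it ignores fill bytes and standalone markers that some files contain.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// segment describes one JPEG marker segment found before the scan data.
type segment struct {
	Marker byte // second marker byte, e.g. 0xE0 for APP0, 0xE1 for APP1
	Offset int  // offset of the 0xFF marker byte from the start of the file
	Length int  // payload length, excluding the two length bytes
}

// listSegments walks the segments between SOI and SOS.
// Hypothetical helper; assumes every marker is followed by a length field.
func listSegments(data []byte) ([]segment, error) {
	if len(data) < 2 || data[0] != 0xFF || data[1] != 0xD8 {
		return nil, errors.New("missing SOI marker")
	}
	var segs []segment
	pos := 2
	for pos+4 <= len(data) {
		if data[pos] != 0xFF {
			return segs, fmt.Errorf("expected marker at offset %d", pos)
		}
		marker := data[pos+1]
		if marker == 0xDA { // SOS: entropy-coded data follows
			break
		}
		// The big-endian length counts its own two bytes.
		size := int(binary.BigEndian.Uint16(data[pos+2:]))
		if size < 2 || pos+2+size > len(data) {
			return segs, fmt.Errorf("bad segment length at offset %d", pos)
		}
		segs = append(segs, segment{Marker: marker, Offset: pos, Length: size - 2})
		pos += 2 + size
	}
	return segs, nil
}

func main() {
	// Synthetic file: SOI, APP0 "JFIF\0", APP1 "Exif\0\0", then SOS.
	data := []byte{
		0xFF, 0xD8,
		0xFF, 0xE0, 0x00, 0x07, 'J', 'F', 'I', 'F', 0,
		0xFF, 0xE1, 0x00, 0x08, 'E', 'x', 'i', 'f', 0, 0,
		0xFF, 0xDA, 0x00, 0x02,
	}
	segs, err := listSegments(data)
	if err != nil {
		fmt.Println("error:", err)
	}
	for _, s := range segs {
		fmt.Printf("marker 0xFF%02X at offset %d, payload %d bytes\n", s.Marker, s.Offset, s.Length)
	}
}
```

A parser that expects APP1 immediately after SOI would miss the metadata in a file that starts with APP0, which is consistent with the symptom reported here.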

I will also take a look at the HEIC image in the next few days.

@mholt
Collaborator Author

mholt commented Jan 9, 2023

@evanoberholster That's awesome, thank you! I hadn't noticed the develop branch.

I'm kind of new to parsing metadata, so I've learned a ton this week. And wow, exiftool -htmldump is truly useful, thanks for the tip.

Of the three metadata libraries I've tried so far -- and all of them have struggled with the first few images I've tried -- this is the only one that appears to be maintained. So thank you, thank you again for that. It's a breath of fresh air.

Can I possibly sponsor you or donate?

Whatever I can do to test changes or help with development, please let me know!

@evanoberholster
Owner

evanoberholster commented Jan 11, 2023

@mholt Thank you for being willing to support this library. The biggest contribution that you could make would be Issues, PRs, and suggestions. I will be pushing a large rewrite soon and would appreciate suggestions and testing. Thanks for your willingness.

Looking further at your jpg file shows the following.

The first XMP tag has standard headers with proprietary tags from "http://ns.google.com/photos/1.0/camera/"

<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?> 
    <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.5.0"> 
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> 
            <rdf:Description 
                rdf:about="" 
                xmlns:GCamera="http://ns.google.com/photos/1.0/camera/" 
                xmlns:GContainer="http://ns.google.com/photos/1.0/container/" 
                xmlns:GContainerItem="http://ns.google.com/photos/1.0/container/item/" 
                xmlns:xmpNote="http://ns.adobe.com/xmp/note/" 
                GCamera:MotionPhoto="1" GCamera:MotionPhotoVersion="1" GCamera:MotionPhotoPresentationTimestampUs="901378" 
                GCamera:shot_log_data="SERSUALvZDVtXnAeLOrjQ5je9sQSDxYeI6WFw+b24jWUOjuJIwNBymft1qWNdGIcTknjXN1nON2pLxaI/RIBi0jHkkTdui6TXLoSxBepOCPXJ096WepulMkoxom/IXi9BlzNvFBkCwWOsDvYpJW3qbfkh89NkYt6XoPrjd/YEu6W77AJyhMbo54w8oSw8YbGPjppnZ9rtG+x8wJzT8iuYOVITZcvzgRQWbqp2rJBI3zPIVm+aTiUsgV0SeIWLmRy5qu++LjnIPmCYmPmO6PbALNfTMctudf7DTYzvlxSQYauHBUte8vX/K/unY655KFFowOYb+sm6PSMh4/1Hq59Xq0UqEQ8xi8gghtuvw8yDjzBdet7ZrxxWMPOwgFrUFFoJj1jkbf0HnKJc3+QgaSWXSpbfeJ8D6/WhgccGkUjpOOiWCXXzdFpmdwkkFQDpYA2usHboQ4nUEFaP7o8UmAmuBOOfhrcvtGxHAosVDswt6GpZdnYG/1dQYCSJ0yhOzH2cbIagxywGyjGSjxAH3H/6AnWiGVTB+Y0TU+XxEkn+ERRdpcG1ktw/NNW8lf50M/2sqoR66Sw7yzujkMkqM3ETzkpMh9NUg/uRZk8aTSWA93DtWKU8PAgF5qao6G38UPzXFLBJ6tlh6EAnth+0NPh8aJWN+jgE/WqzqqMaRGXiB3q+AlYrsDriVS9ZKPPd7xfjtyVWVhAg024bBrK2q1atiuZztPSdwVFPQE=" 
                xmpNote:HasExtendedXMP="F482C5BBD3889E0BC7D8F325D346F3AE"> 
                <GContainer:Directory> 
                    <rdf:Seq> 
                        <rdf:li rdf:parseType="Resource"> 
                            <GContainer:Item GContainerItem:Mime="image/jpeg" GContainerItem:Semantic="Primary" GContainerItem:Length="0" GContainerItem:Padding="0"/> 
                        </rdf:li> 
                        <rdf:li rdf:parseType="Resource"> 
                            <GContainer:Item GContainerItem:Mime="video/mp4" GContainerItem:Semantic="MotionPhoto" GContainerItem:Length="2037777" GContainerItem:Padding="0"/> 
                        </rdf:li> 
                    </rdf:Seq> 
                </GContainer:Directory> 
            </rdf:Description> 
        </rdf:RDF> 
    </x:xmpmeta>
<?xpacket end="w"?>

The second XMP tag contains an XMP extension "F482C5BBD3889E0BC7D8F325D346F3AE" that appears to be mentioned in the first XMP tag. It contains what appears to be binary data from a maker note.

<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.5.0"> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> <rdf:Description rdf:about="" xmlns:GCamera="http://ns.google.com/photos/1.0/camera/" GCamera:hdrp_makernote="SERSUALvZDVtXnAeLOrjLFZmiFHXojfpBfD.....

The third XMP tag contains another XMP extension "F482C5BBD3889E0BC7D8F325D346F3AE" that appears to be the continuing bytes from the second XMP tag.

.....jqYS8mX7cfnw070R4="/> </rdf:RDF> </x:xmpmeta>
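The split across the second and third XMP tags matches the ExtendedXMP layout in the XMP specification (Part 3): each extension APP1 payload starts with the null-terminated signature "http://ns.adobe.com/xmp/extension/", followed by a 32-character GUID, a 4-byte total length, a 4-byte offset, and then that chunk's portion of the data. The sketch below reassembles such chunks. It is an illustration under those assumptions, not code from this library; extendedXMP and addChunk are hypothetical names, and it does not verify that the GUID is the MD5 of the data or that every byte was received.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// extSignature is the null-terminated namespace that opens an
// ExtendedXMP APP1 payload.
var extSignature = []byte("http://ns.adobe.com/xmp/extension/\x00")

// extendedXMP collects the chunks that share one GUID.
type extendedXMP struct {
	GUID string
	Data []byte
}

// addChunk parses one APP1 payload and copies its portion into place.
// Note: total comes from the file, so real code should cap it before
// allocating.
func (e *extendedXMP) addChunk(payload []byte) error {
	if !bytes.HasPrefix(payload, extSignature) {
		return errors.New("not an extended XMP segment")
	}
	rest := payload[len(extSignature):]
	if len(rest) < 40 {
		return errors.New("short extended XMP header")
	}
	guid := string(rest[:32])
	total := binary.BigEndian.Uint32(rest[32:36])
	offset := binary.BigEndian.Uint32(rest[36:40])
	chunk := rest[40:]
	if e.GUID == "" { // first chunk decides the GUID and total size
		e.GUID, e.Data = guid, make([]byte, total)
	}
	if guid != e.GUID || uint64(total) != uint64(len(e.Data)) ||
		uint64(offset)+uint64(len(chunk)) > uint64(len(e.Data)) {
		return errors.New("chunk does not fit this extended XMP")
	}
	copy(e.Data[offset:], chunk)
	return nil
}

// makeChunk builds a synthetic payload for the demonstration below.
func makeChunk(guid string, total, offset uint32, part []byte) []byte {
	var hdr [8]byte
	binary.BigEndian.PutUint32(hdr[0:4], total)
	binary.BigEndian.PutUint32(hdr[4:8], offset)
	buf := append([]byte{}, extSignature...)
	buf = append(buf, guid...)
	buf = append(buf, hdr[:]...)
	return append(buf, part...)
}

func main() {
	guid := "F482C5BBD3889E0BC7D8F325D346F3AE" // GUID quoted in this thread
	var ext extendedXMP
	// Chunks may be processed in any order; the offset places them.
	for _, c := range [][]byte{
		makeChunk(guid, 11, 6, []byte("world")),
		makeChunk(guid, 11, 0, []byte("hello ")),
	} {
		if err := ext.addChunk(c); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Printf("%s: %q\n", ext.GUID, ext.Data)
}
```

The GUID in each chunk is the same value that the main packet carries in xmpNote:HasExtendedXMP, which is how a reader can tie the pieces back to the first XMP tag.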

Unfortunately I have neither the expertise nor a strong interest in supporting proprietary XMP. The goal of this library is to support the basic XMP namespaces as listed here.

@mholt
Collaborator Author

mholt commented Jan 11, 2023

@evanoberholster Sounds good to me. Thanks for explaining that -- I totally understand not having the desire to support proprietary fields. Can your lib at least extract the bytes of the MakerNotes? It doesn't have to decode them. But if it's part of a standard xmp document then I hope we could at least access them for later.

> I will be pushing a large rewrite soon and would appreciate suggestions and testing.

I'm looking forward to trying it! Will it have the ability to detect the file type? Or will I need to call a type-specific function?
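Until that question is answered, a caller can distinguish the two formats discussed in this issue by sniffing the first bytes. The sketch below is a caller-side assumption, not the library's detection logic: sniff is a made-up function, it relies on JPEG files starting with FF D8 FF and ISO base media files carrying "ftyp" at offset 4, and the brand list is deliberately incomplete.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniff guesses the container type from the first bytes of a file.
// Hypothetical helper; it recognizes only the formats in this thread.
func sniff(header []byte) string {
	switch {
	case len(header) >= 3 && header[0] == 0xFF && header[1] == 0xD8 && header[2] == 0xFF:
		return "jpeg"
	case len(header) >= 12 && bytes.Equal(header[4:8], []byte("ftyp")):
		// Bytes 8..12 hold the major brand of the ftyp box.
		switch string(header[8:12]) {
		case "heic", "heix", "mif1", "msf1":
			return "heif"
		}
		return "isobmff"
	}
	return "unknown"
}

func main() {
	jpeg := []byte{0xFF, 0xD8, 0xFF, 0xE0}
	heic := []byte("\x00\x00\x00\x18ftypheic\x00\x00\x00\x00")
	fmt.Println(sniff(jpeg), sniff(heic), sniff([]byte("plain text")))
}
```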

@evanoberholster evanoberholster self-assigned this Jan 30, 2023
@evanoberholster evanoberholster added bug Something isn't working needs refactoring Code needs refactoring for performance/function labels Jan 30, 2023