UnicodeDecodeError when parsing ical with non-ASCII characters in UID #128

Open
pomeloy opened this issue Jun 16, 2023 · 1 comment

pomeloy commented Jun 16, 2023

icalparser.py crashes when trying to parse an iCal file with non-ASCII characters in the UID field. I know RFC 5545 3.1 says you are not supposed to do that, but IMO this should not break the parser.

icalparser.py#L208

    event.uid = component.get("uid").encode("utf-8").decode("ascii")

This obviously fails with a non-ASCII character.

I am not exactly sure why the encoding/decoding is necessary, but either removing this or adding a try-except block defaulting to str(uuid4()) would fix this. Did I miss something here?
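
The try-except option could look roughly like the sketch below. `safe_uid` is a hypothetical helper name, not something that exists in icalparser.py, and it assumes the UID arrives as a string (or is missing entirely).

```python
from uuid import uuid4


def safe_uid(raw_uid):
    """Return raw_uid if it is pure ASCII, otherwise a random UUID string.

    Hypothetical helper for illustration; not part of icalparser.py.
    """
    if not raw_uid:
        # Missing or empty UID: generate one.
        return str(uuid4())
    try:
        # Same round-trip as the original line, but guarded.
        return str(raw_uid).encode("utf-8").decode("ascii")
    except UnicodeDecodeError:
        return str(uuid4())


print(safe_uid("abc-123@example.com"))       # ASCII UID is returned unchanged
print(safe_uid("\ufffd\ufffd\ufffd\ufffd"))  # non-ASCII UID is replaced by a new UUID
```

One caveat with this approach: a random fallback produces a different UID each time the same file is parsed, which matters if anything downstream relies on stable IDs.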

@mbafford

I started encountering this in my use case a few days ago as well.

In my case the UID value is:

b'\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd'
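
For what it's worth, those bytes are four UTF-8 encoded U+FFFD replacement characters, so the original UID was probably already lost before the feed reached the parser (that part is an inference from the bytes, not something I have confirmed). A quick check of what they decode to and why the ASCII decode fails:

```python
raw = b'\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd'

# Valid UTF-8: each 3-byte group is one U+FFFD replacement character.
text = raw.decode("utf-8")
print(len(text), [hex(ord(c)) for c in text])
# 4 ['0xfffd', '0xfffd', '0xfffd', '0xfffd']

# Not valid ASCII: every byte is above 0x7f, so this raises.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print("ascii decode failed:", exc.reason)
```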

I'm syncing to Google Calendar with https://github.com/andrewramsay/ical_to_gcal_sync so I need unique and reliable ID values. So instead of using a uuid(), I added a hash of the DTSTAMP and SUMMARY values:

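    import hashlib

    uid = None
    if component.get("uid"):
        try:
            # Keep the original UID only if it is pure ASCII.
            uid = component.get("uid").encode("utf-8").decode("ascii")
        except UnicodeDecodeError:
            pass

    if not uid:
        # Fall back to a deterministic ID derived from DTSTAMP and SUMMARY,
        # so the same event gets the same UID on every sync.
        uid = hashlib.md5((
            str(component.get("dtstamp").dt.timestamp()) +
            str(component.get("summary"))
        ).encode('utf-8')).hexdigest()

    event.uid = uid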

I don't know how reliable this will be, but in my case the UID only changes when one of those two values changes. That results in a delete+recreate on Google Calendar, but the overall end state stays the same. As long as I minimize that by using a non-random UID, it's good enough for me.
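
To illustrate that trade-off in isolation, here is a minimal standalone sketch. `fallback_uid` is a hypothetical helper that takes a plain `datetime` and a string rather than an icalendar component, and the sample values mirror the DTSTAMP and SUMMARY of the event below.

```python
import hashlib
from datetime import datetime, timezone


def fallback_uid(dtstamp, summary):
    """Derive a stable ID from DTSTAMP and SUMMARY (hypothetical helper)."""
    key = str(dtstamp.timestamp()) + str(summary)
    return hashlib.md5(key.encode("utf-8")).hexdigest()


stamp = datetime(2023, 12, 14, tzinfo=timezone.utc)  # DTSTAMP:20231214T000000Z

first = fallback_uid(stamp, "Appointment")
second = fallback_uid(stamp, "Appointment")
renamed = fallback_uid(stamp, "Appointment (moved)")

print(first == second)   # True: same inputs give the same ID on every sync
print(first == renamed)  # False: a changed SUMMARY yields a new ID
```

The flip side is that two distinct events sharing the same DTSTAMP and SUMMARY would end up with the same ID. Mixing in DTSTART as well would likely reduce that risk, though I have not tried it.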

The broken event looks like this:

BEGIN:VEVENT
DESCRIPTION:\nNOTE: This is a Multi customer booking. Log into Bookings to 
 see customer information and notes for this event.\n\nBooking Info\n------
 --------------REDACTED\n\nManage booking\n
-------------------------------------\n<URL>\n\nLearn more
  https://aka.ms/JoinTeamsMeeting\n
UID:����
SUMMARY:Appointment
DTSTART;TZID=Eastern Standard Time:20231201T173000
DTEND;TZID=Eastern Standard Time:20231201T180000
CLASS:PUBLIC
PRIORITY:5
DTSTAMP:20231214T000000Z
TRANSP:OPAQUE
STATUS:CONFIRMED
SEQUENCE:3
LOCATION:
X-MICROSOFT-CDO-APPT-SEQUENCE:3
X-MICROSOFT-CDO-BUSYSTATUS:BUSY
X-MICROSOFT-CDO-INTENDEDSTATUS:BUSY
X-MICROSOFT-CDO-ALLDAYEVENT:FALSE
X-MICROSOFT-CDO-IMPORTANCE:1
X-MICROSOFT-CDO-INSTTYPE:0
X-MICROSOFT-DONOTFORWARDMEETING:FALSE
X-MICROSOFT-DISALLOW-COUNTER:FALSE
X-MICROSOFT-REQUESTEDATTENDANCEMODE:DEFAULT
X-MICROSOFT-ISRESPONSEREQUESTED:FALSE
END:VEVENT
