Skip to content
Advertisement

getting HTML source or rich text from the X clipboard

How can rich text or HTML source code be obtained from the X clipboard? For example, if you copy some text from a web browser and paste it into kompozer, it pastes as HTML, with links etc. preserved. However, xclip -o for the same selection just outputs plain text, reformatted in a way similar to that of elinks -dump. I’d like to pull the HTML out and into a text editor (specifically vim).

I asked the same question on superuser.com, because I was hoping there was a utility to do this, but I didn’t get any informative responses. The X clipboard API is to me yet a mysterious beast; any tips on hacking something up to pull this information are most welcome. My language of choice these days is Python, but pretty much anything is okay.

Advertisement

Answer

In X11 you have to communicate with the selection owner, ask about supported formats, and then request data in the specific format. I think the easiest way to do this is using existing windowing toolkits. E,g. with Python and GTK:

#!/usr/bin/python

import glib, gtk

def test_clipboard():
    clipboard = gtk.Clipboard()
    targets = clipboard.wait_for_targets()
    print "Targets available:", ", ".join(map(str, targets))
    for target in targets:
        print "Trying '%s'..." % str(target)
        contents = clipboard.wait_for_contents(target)
        if contents:
            print contents.data

def main():
    mainloop = glib.MainLoop()
    def cb():
        test_clipboard()
        mainloop.quit()
    glib.idle_add(cb)
    mainloop.run()

if __name__ == "__main__":
    main()

Output will look like this:

$ ./clipboard.py 
Targets available: TIMESTAMP, TARGETS, MULTIPLE, text/html, text/_moz_htmlcontext, text/_moz_htmlinfo, UTF8_STRING, COMPOUND_TEXT, TEXT, STRING, text/x-moz-url-priv
...
Trying 'text/html'...
I asked <a href="http://superuser.com/questions/144185/getting-html-source-or-rich-text-from-the-x-clipboard">the same question on superuser.com</a>, because I was hoping there was a utility to do this, but I didn't get any informative responses.
Trying 'text/_moz_htmlcontext'...
<html><body class="question-page"><div class="container"><div id="content"><div id="mainbar"><div id="question"><table><tbody><tr><td class="postcell"><div><div class="post-text"><p></p></div></div></td></tr></tbody></table></div></div></div></div></body></html>
...
Trying 'STRING'...
I asked the same question on superuser.com, because I was hoping there was a utility to do this, but I didn't get any informative responses.
Trying 'text/x-moz-url-priv'...
http://stackoverflow.com/questions/3261379/getting-html-source-or-rich-text-from-the-x-clipboard
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement