{"id":787,"date":"2010-05-12T21:15:40","date_gmt":"2010-05-13T03:15:40","guid":{"rendered":"http:\/\/thesmithfam.org\/blog\/?p=787"},"modified":"2019-08-12T07:15:31","modified_gmt":"2019-08-12T13:15:31","slug":"smart-folder-synchronization-with-python","status":"publish","type":"post","link":"https:\/\/thesmithfam.org\/blog\/2010\/05\/12\/smart-folder-synchronization-with-python\/","title":{"rendered":"Smart Folder Synchronization with Python"},"content":{"rendered":"<p>I have various files that download to my computer automatically. They arrive at different times of the day or night. When they do, I like to transfer them to a network drive on another computer automatically.<\/p>\n<p>But there&#8217;s a wrinkle.<\/p>\n<p>When the files auto-download to my computer, they go into <strong>one<\/strong> folder automatically, but when I transfer them to my network drive, I like them to go into <strong>different<\/strong> folders, based on their file name.<\/p>\n<p>Python to the rescue.<\/p>\n<p>I whipped out this little script that runs from cron every 5 minutes on my Mac, rsync&#8217;ing files from my &#8220;Downloads&#8221; folder into a specific subfolder of &#8220;\/Volumes\/Shared&#8221;, following a set of rules. In this script, any file with the string &#8220;dave&#8221; in its name gets copied to the &#8220;\/Volumes\/Shared\/Dave Stuff&#8221; folder. 
Any file with &#8220;bob&#8221; in its name gets sent to &#8220;\/Volumes\/Shared\/Files For Bob&#8221;, and so on.<\/p>\n<p>Here it is for your enjoyment:<\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\n#!\/usr\/bin\/python\r\n\r\n# Filename filters, and which folders to send them to:\r\nfilters = {\r\n    # Filename  :  Dest Folder\r\n    &#039;dave&#039;      : &#039;Dave Stuff&#039;,\r\n    &#039;bob&#039;       : &#039;Files For Bob&#039;,\r\n    &#039;frank&#039;     : &#039;Franks Junk&#039;,\r\n    }\r\n\r\nsrc   = &#039;\/Users\/Dave\/Downloads&#039;\r\ndest  = &#039;\/Volumes\/Shared&#039;\r\nrsync = [&#039;rsync&#039;, &#039;--times&#039;]\r\n\r\n# ----------------------------------------------------------\r\n\r\nimport os\r\nimport sys\r\nimport subprocess\r\n\r\n# Only show progress when we&#039;re running in a terminal (and not cron):\r\nif sys.stdout.isatty():\r\n    rsync.append(&#039;--progress&#039;)\r\n\r\nfor dir, dirs, files in os.walk(src):\r\n    for filename in files:\r\n        # Skip hidden files and partial downloads:\r\n        if filename.startswith(&quot;.&quot;) or filename.endswith(&quot;.part&quot;):\r\n            continue\r\n        fullpath = os.path.join(dir, filename)\r\n        for filter, destfolder in filters.iteritems():\r\n            if filter in filename.lower():\r\n                fulldest = os.path.join(dest, destfolder)\r\n                print &quot;Copying &#039;&quot; + filename + &quot;&#039; to folder &#039;&quot; + destfolder + &quot;&#039;&quot;\r\n                # Pass the arguments as a list (no shell), so quotes and spaces in file names are safe:\r\n                cmd = rsync + [fullpath, fulldest + &#039;\/.&#039;]\r\n                process = subprocess.Popen(cmd)\r\n                try:\r\n                    process.wait()\r\n                except KeyboardInterrupt:\r\n                    process.kill()\r\n                    sys.exit(1)\r\n                break\r\n        else:\r\n            print &#039;Could not find a home for file &quot;&#039; + filename + 
&#039;&quot;&#039;\r\n<\/pre>\n<p>When this script runs, it just blindly tells rsync to transfer the files, but rsync skips any file whose size and modification time already match the copy in the &#8220;dest&#8221; folder. That works thanks to rsync&#8217;s &#8220;--times&#8221; argument, which tells rsync to preserve the file modification times when it does the transfer, so the copies still match on later runs.<\/p>\n<p>So far it&#8217;s working great. I called it &#8220;sync-download-files&#8221; and created a crontab file called \/Users\/Dave\/etc\/crontab that looks like this:<\/p>\n<p><code>*\/5 * * * * ~\/bin\/sync-download-files >\/dev\/null 2>&1<\/code><\/p>\n<p>Then I ran this crontab command to install the job:<\/p>\n<p><code>crontab \/Users\/Dave\/etc\/crontab<\/code><\/p>\n<p>And voila! Files now auto-sync to \/Volumes\/Shared every 5 minutes. When there are no files to sync, the script completes in a couple of seconds.<\/p>\n<p>By the way, in my case, \/Volumes\/Shared is a Samba-mounted network share on a WDTV Live box.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I have various files that download to my computer automatically. They arrive at different times of the day or night. When they do, I like to transfer them to a network drive on another computer automatically. But there&#8217;s a wrinkle. 
When the files auto-download to my computer, they go into one folder automatically, but when [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-787","post","type-post","status-publish","format-standard","hentry","category-code-and-cruft"],"_links":{"self":[{"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/posts\/787","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/comments?post=787"}],"version-history":[{"count":19,"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/posts\/787\/revisions"}],"predecessor-version":[{"id":1519,"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/posts\/787\/revisions\/1519"}],"wp:attachment":[{"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/media?parent=787"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/categories?post=787"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/thesmithfam.org\/blog\/wp-json\/wp\/v2\/tags?post=787"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}