26 Replies Latest reply: Nov 11, 2007 6:57 PM by Justin Surpless
Justin Surpless Level 4 (1,490 points)
OK... I just noticed something strange... I had some movie files that I deleted sometime yesterday (don't need them anymore)... just now, I wanted to retrieve one of them with TM and they're gone!

My backup drive is nowhere near full and TM didn't warn me about deleting anything... looking at my backups, I noticed that I only have backups for yesterday at 8:03AM and from 8:09PM (and on)... I'm guessing that's the case because anything before 8:09PM is now more than 24 hrs old... hence, it's 'thinned out'... I also noticed that I only have my initial backup from Friday... not the several it made afterwards... and that backup from Friday is missing some data I had that was captured by the later backups...

What's going on here? Seems to me that it should have all of my stuff, no?

Mac Pro 2.66, 4.0 GB RAM, 23 Apple Cinema HD Display, Mac OS X (10.5), EyeTV 200, Powerbook G4/667, 512 MB RAM
  • jdelima Level 3 (515 points)
    The way I read the Apple info on TM makes it look like a GFS (grandfather-father-son) backup scheme to me. That means that it should keep hourlies for the last 24 hrs, dailies for a month and weeklies for as long as you have space. I assumed from this that it made good use of full, differential and incremental backups to achieve this, which should mean that at any point in time you should be able to restore anything at all.

    The exception to this would be if you created a file (or modified it) AFTER a TM backup, and then erased it (or modified it again) BEFORE the next backup. That sort of change would be lost.

    I guess we'll only know for sure exactly how useful it is once people have had it running for a while to see exactly how it operates. If it isn't operating as a normal GFS backup would, then it's mostly wasting space for me!
  • Justin Surpless Level 4 (1,490 points)
    That is my understanding as well... that if you create a file but you delete it during the interval between backups... that is completely reasonable... the only solution would be to have instantaneous backup and that's clearly no solution!

    What I'm observing is not quite this... as another example, a backup just occurred at 8:32 PM for me... as such, it deleted the aforementioned 8:09PM from the previous day... I had thought that the differences were supposed to be concatenated somehow...

    So far it seems that (as Apple has stated) if a file is created and deleted between hourly backups, you're on your own... but it also seems that if a file doesn't stay around for more than a day, you're on your own once 24 hrs have passed... also, it seems to be keeping the 1st backup of the day (it kept 10/26 at 7:20PM and 10/27 at 8:03AM)... so I lost any files created and deleted on Friday after 7:20 and similarly for after 8:03AM on Sat...
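
The thinning behaviour observed above can be sketched roughly as follows. This is only a model of what the thread describes, not Apple's actual implementation: the function name `thin`, the exact 24-hour and 30-day cutoffs, and the "first snapshot of each day / each ISO week" rule are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def thin(snapshots, now):
    """Return the snapshot times that survive thinning, oldest first.

    Assumed policy (hypothetical, inferred from the thread):
      - younger than 24 hours: keep every snapshot
      - up to 30 days old:     keep the first snapshot of each calendar day
      - older than 30 days:    keep the first snapshot of each ISO week
    """
    kept = []
    seen_days = set()
    seen_weeks = set()
    for snap in sorted(snapshots):
        age = now - snap
        if age <= timedelta(hours=24):
            kept.append(snap)
        elif age <= timedelta(days=30):
            day = snap.date()
            if day not in seen_days:
                seen_days.add(day)
                kept.append(snap)
        else:
            week = tuple(snap.isocalendar()[:2])  # (ISO year, ISO week)
            if week not in seen_weeks:
                seen_weeks.add(week)
                kept.append(snap)
    return kept

# Timestamps loosely based on the ones mentioned in this thread.
snaps = [
    datetime(2007, 10, 26, 19, 20),
    datetime(2007, 10, 26, 20, 20),
    datetime(2007, 10, 27, 8, 3),
    datetime(2007, 10, 27, 12, 3),
    datetime(2007, 10, 27, 20, 9),
    datetime(2007, 10, 27, 21, 9),
    datetime(2007, 10, 28, 20, 32),
]
for snap in thin(snaps, now=datetime(2007, 10, 28, 20, 32)):
    print(snap)
```

Under these assumptions, the 8:09PM snapshot from the previous day is dropped as soon as it is more than 24 hours old, because an earlier snapshot from the same day (8:03AM) has already been kept as that day's daily.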

    I'm going to create a test file and let it be captured by the next hourly... then I plan to delete it... as such, it should be in the backup set... and then in 24 hrs, I plan to see if I can still see the file...

  • jdelima Level 3 (515 points)
    Please do test this! Sounds fishy to me!
  • Ewal Level 1 (0 points)
    I don't think it does any kind of concatenation. Time Machine just records what your system looked like at different points in time. Short lived files will disappear as you have seen. The same thing will happen for a file that exists only for a couple days in the middle of the week. Eventually, Time Machine will get rid of the daily backups (keeping only weekly backups) and the file will no longer be in any available backup.
  • Justin Surpless Level 4 (1,490 points)
    I think that you're right... just not quite what I expected... I guess I got caught because I had data which had been on my drive for a long time (and thus not short term in my mind)... but Time Machine didn't know that... it only saw the file and then within a day, it didn't see it... hence, it considered it short term and deleted it once the day passed...

    So to summarize... this seems to be what will happen...

    Files created

    1) lasting less than 1 hr will NOT be backed up, period...

    2) lasting longer than an hour BUT less than a day, will be deleted after 24 hrs when yesterday's backups are 'thinned'

    3) lasting longer than a day BUT less than a week, will be deleted after 1 month when the previous month's are 'thinned'

    So... it seems that only files lasting longer than 1 week will remain on the drive permanently (barring the need to 'thin' for space reasons)
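
As a rough rule of thumb, the summary above could be written down like this. The function name `expected_retention` and the returned strings are made up for illustration, and the thresholds simply restate the list above; what really matters is which snapshots happen to catch the file, so treat this as an approximation.

```python
from datetime import timedelta

def expected_retention(lifetime):
    """Map how long a file existed to the retention summarized above.

    Hypothetical helper: it encodes the rule of thumb from this post,
    not a documented Time Machine guarantee.
    """
    if lifetime < timedelta(hours=1):
        return "may never be backed up"
    if lifetime < timedelta(days=1):
        return "dropped after about 24 hours"
    if lifetime < timedelta(weeks=1):
        return "dropped after about a month"
    return "kept until space runs out"

print(expected_retention(timedelta(hours=5)))
print(expected_retention(timedelta(days=3)))
```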

    As I have thought more about it, this seems to be reasonable...
  • Justin Surpless Level 4 (1,490 points)
    This is exactly what TM is doing...
  • perthomsen Level 1 (0 points)
    Justin Surpless wrote:
    As I have thought more about it, this seems to be reasonable...

    Reasonable? I don't think so! It's an 'easy way out' engineering-wise, but it is most definitely the wrong way to do it.

    This way you won't have any guarantees that your backups will capture a short-lived file that you may want to find at a later date. It is reasonable (sort of) that a file that is created and deleted before TM runs is not captured. But... If a file is backed up, and after 24 hours is 'unbacked up', how do you explain that to a user trying to look for the file?

    What makes sense to me is a more complicated, but more correct system, where all files are equal after they have been backed up, and will only be deleted when you run out of space on the backup disk.

    The problem is that the thinning of backups after 24hrs is arbitrary to the user, and results in some short-lived files being in the 'daily' backup, and others not, without any discernible pattern.

    The concept of 'the longer the file stuck around on the hard drive, the more valuable it must be' seems flawed to me. You could have a file that you worked on for hours and hours on end during one day, but if it wasn't captured in the 'first' backup of the day, it will get deleted from all backups after 24hrs. And, if it happened to be in the 'lucky' hour of the daily backup, too bad if it was not the 'first day of the week', because then the file will be gone from TM in 1 week.

    Thanks,
    Per
  • Justin Surpless Level 4 (1,490 points)
    It's definitely the 'easy way out'... but you need to understand that it is discernable as to what files will ultimately be 'lost'... I think I read somewhere that Time Machine isn't trying to backup every single file but to intelligently back up the files you 'need' (Ars Technica, perhaps?)...

    The theory being as follows... if you create a file during a given period of time (governed by the backup interval) which lasts long enough to be captured by a backup scan, it has to exist until the start of the next interval period in order for it to be considered worth keeping by TM...

    So... Apple 'decided' that a file that doesn't last more than 24 hrs is not 'worth being kept' once 24 hrs have passed... similarly for a file that lasts more than 24 hrs but less than a week is not worth being kept after a month...

    I'm not saying that I think it's perfect... I'm just trying to convey that it is discernible in what it purges and so on...
  • perthomsen Level 1 (0 points)
    Justin Surpless wrote:
    It's definitely the 'easy way out'... but you need to understand that it is discernable as to what files will ultimately be 'lost'... I think I read somewhere that Time Machine isn't trying to backup every single file but to intelligently back up the files you 'need' (Ars Technica, perhaps?)...

    The theory being as follows... if you create a file during a given period of time (governed by the backup interval) which lasts long enough to be captured by a backup scan, it has to exist until the start of the next interval period in order for it to be considered worth keeping by TM...

    So... Apple 'decided' that a file that doesn't last more than 24 hrs is not 'worth being kept' once 24 hrs have passed... similarly for a file that lasts more than 24 hrs but less than a week is not worth being kept after a month...


    I've been trying to do a bit of research on this. What TM itself says, is that it backs up hourly for 24 hours, and it keeps dailies for one month, and weeklies as long as there is space.

    If the way you are explaining it is accurate, you can have a file that lives for 23 hours (and is in the hourly backups for 22 or 23 hours), but is not part of the daily backup because it was created after the one that becomes the daily backup after the thinning, and was deleted before the one that will become the next daily backup.

    It is of course a lot harder to build a system that picks up these files, and includes them in the daily and weekly backups, because TM is basically a point-in-time backup system. If the file you are looking for didn't exist in any of the points-in-time that we have backups for (which is some time each day for the dailies, and some time each week for weeklies), you're out of luck.
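
To make the point-in-time problem concrete, here is a small sketch. It assumes a snapshot "contains" a file if it was taken while the file existed, and it assumes the two snapshots kept as dailies are the ones at 8:03AM; the function name `snapshots_containing` and all timestamps are illustrative only.

```python
from datetime import datetime, timedelta

def snapshots_containing(snapshots, created, deleted):
    """Return the snapshots taken while the file existed."""
    return [t for t in snapshots if created <= t < deleted]

# One snapshot per hour, from 8:03AM one day to 8:03AM the next.
start = datetime(2007, 10, 27, 8, 3)
hourlies = [start + timedelta(hours=h) for h in range(25)]

# Assumption: after thinning, only the first snapshot of each day survives.
dailies = [hourlies[0], hourlies[24]]

# A file that lives for 23 hours, entirely between the two dailies.
created = datetime(2007, 10, 27, 8, 30)
deleted = datetime(2007, 10, 28, 7, 30)

print(len(snapshots_containing(hourlies, created, deleted)))  # hourlies that saw the file
print(len(snapshots_containing(dailies, created, deleted)))   # dailies that saw the file
```

In this setup the file shows up in 23 hourly snapshots, yet in none of the snapshots that survive as dailies.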

    It is possible to discern what is being kept and what is being lost only to the extent that you know whether a file was around when a backup that is being kept longer than 24 hours (or 30 days) was made. That is my main concern here.

    I generally think TM is a great leap forward, but I think it needs some serious overhaul.

    Thanks,
    Per
  • jdelima Level 3 (515 points)
    It could be better implemented for sure. For the average Joe though, it's fantastic and that includes most users who have never had inbuilt backup. Sure it could get crafty with incrementals and differentials to have a complete backup set, but consider it from the average Joe's perspective when restoring a file. They can restore what is backed up - which is hourlies for the previous 24 hrs, dailies for the previous month, and weeklies forever.

    If you have all of the hourly data essentially versioned and incorporated into the daily, then Joe wants to restore from 3 days ago, if he has a file there that he has changed every single hour for that particular day, then he has 24 possible files to restore! It becomes a versioning nightmare. Though my example is extreme it's possible.

    Also consider what is currently done for backups. People manually, or to a schedule, do point in time backups. Not as frequently as time machine, but it still happens. They have the exact same issues, with the exception that TM may have people becoming relaxed about their data, falsely believing it is always going to be in a backup set somewhere.

    Sure it has its flaws, but overall it's better there than not.
  • Justin Surpless Level 4 (1,490 points)
    Oh, it definitely needs some work... I (and I'm sure many others) would like to have the ability to backup to multiple drives (different directories)...
  • Justin Surpless Level 4 (1,490 points)
    I think what's rubbing 'perthomsen' (and what was/is rubbing me) is that the way TM works now, it can lull you into a false sense of security in thinking that your file is backed up and will remain backed up... the primary issue that I have is that I don't believe Apple explains this anywhere... all they say on the MacOS X website is that

    "Only files created and then deleted before the next hourly backup will not be included in the long term. Put another way: You’re well covered."

    Now, notice that they say 'long term' but it is subjective... personally, I was expecting that if a file was captured in an hourly but was then deleted... TM would somehow merge the hourly backups into a daily (not simply take one of the hourlies as the daily)... if this was the way, I think that it would work more as advertised in that if a file was backed up once, it'd be brought forward again... and in the case of a continuously changing file that you mentioned below... I think it would be reasonable to keep only the last one past a period of time... no?
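
The difference between "merge the hourlies into a daily" and "pick one hourly as the daily" can be sketched like this. Each snapshot is modelled as a plain dict from path to a version number, which is purely an assumption for illustration (it says nothing about how TM stores data), and `merge_hourlies` is a hypothetical name.

```python
def merge_hourlies(hourlies):
    """Fold a day's hourly snapshots (oldest first) into one daily.

    Every file seen in any hourly is carried forward, and for files
    that changed during the day only the last captured version is kept.
    """
    merged = {}
    for snap in hourlies:
        merged.update(snap)
    return merged

hourlies = [
    {"report.txt": 1},
    {"report.txt": 2, "movie.mov": 1},  # movie.mov exists for this hour only
    {"report.txt": 3},
]

picked = hourlies[0]               # what the thread observes: one hourly becomes the daily
merged = merge_hourlies(hourlies)  # what was expected instead

print(picked)
print(merged)
```

With merging, `movie.mov` survives into the daily even though it was deleted within the day, and only the last version of `report.txt` is kept, which is the "keep only the last one" compromise suggested above.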

    For me, it's not so bad because I was using 'Chronosync' and it only ran once per day... so I already had it in my mind that a file created and deleted in one day would NOT be captured... similarly, once I deleted a file - I only have X days before it was purged... anyways...
  • perthomsen Level 1 (0 points)
    jdelima wrote:

    If you have all of the hourly data essentially versioned and incorporated into the daily, then Joe wants to restore from 3 days ago, if he has a file there that he has changed every single hour for that particular day, then he has 24 possible files to restore! It becomes a versioning nightmare. Though my example is extreme it's possible.


    Yeah, but that is what computers are good at, though. You can even put a simple UI on this, where the most recent version (the last edited version before its untimely deletion) is the only thing you see, unless you want to delve deeper into the depths of versions of the file.


    Also consider what is currently done for backups. People manually, or to a schedule, do point in time backups. Not as frequently as time machine, but it still happens. They have the exact same issues, with the exception that TM may have people becoming relaxed about their data, falsely believing it is always going to be in a backup set somewhere.


    Yeah, point-in-time backups are always an issue, because they are just a snapshot, and not a journal of things done to change a file, so you can't recover something if it was never in a point-in-time backup.


    Sure it has its flaws, but overall it's better there than not.

    I totally agree with that. Getting people to use some backup (even if flawed) is way better than the dismal state of backups for end users. I guess I'm just advocating for improving even further the state of backups, to a point where you can actually rely on them 99.999%.

    In case anyone is interested, I have been participating in an Open Source backup project "Box Backup" for the last few years. It supports the keeping of deleted files as long as the backup server has space left. It also solves another TM problem, in that it backs up only what changed in a large file (like a Parallels virtual Disk), so you don't need to do full backups of every file every time it changes. It has many other features as well, such as network backups, encrypted backups, and multiple platform support (Windows, Mac, Linux, and more).
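
To give an idea of what "backing up only what changed in a large file" means, here is a minimal sketch using fixed-size chunks and SHA-256 hashes. This is not Box Backup's actual algorithm; the function names and the 4096-byte chunk size are assumptions chosen for illustration.

```python
import hashlib

def chunk_hashes(data, chunk_size=4096):
    """Hash each fixed-size chunk of a byte string."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]

def changed_chunks(old, new, chunk_size=4096):
    """Return the indexes of chunks in `new` that differ from `old`."""
    old_hashes = chunk_hashes(old, chunk_size)
    new_hashes = chunk_hashes(new, chunk_size)
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != digest
    ]

# A 16 KB "disk image" in which a single byte changes.
old = bytes(4096 * 4)
new = bytearray(old)
new[5000] = 1
print(changed_chunks(old, bytes(new)))
```

Only the chunk containing the modified byte would need to be stored again, instead of the whole file.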

    Go to http://www.boxbackup.org/ to check it out.

    This, however is not a solution (yet) for the average Joe to use. It requires a server on which to run the backup server (which could be a Mac), and some setup, and use of Terminal.app, etc. But it does solve many of the issues with TM.

    Thanks,
    Per
  • jdelima Level 3 (515 points)
    I completely agree with you and Per. The dilemma as I see it is where do you draw the line between version control and backup, if every piece of data is backed up, as would be nice? The weeklies (in a ridiculous example) could in theory, for a file changed every hour of every day of the week, hold 168 different versions of one document!

    The only way I see of handling that is to have all the documents in the weekly backup set with their original timestamps clearly noted. Not a hard ask, I don't think.