Tuesday, April 18, 2023

C# Parse Linux stat command timestamp with timezone


Here's a prototype showing how to handle datetime formats that the .NET Framework's DateTime and DateTimeOffset format strings do not support.


This code parses a UNIX/Linux datetime with timezone, specifically as generated by the stat command.

stat .bashrc | grep Modify
Modify: 2014-03-30 23:14:47.658210121 -0500
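
The timestamp has to be separated from the "Modify:" label before it can be parsed. Here is a minimal sketch of that step, assuming the line looks exactly like the grep output above. The class and method names (StatOutput, ExtractModifyTimestamp) are hypothetical and only for illustration.

```csharp
using System;

public static class StatOutput
{
    // Hypothetical helper: returns the text after the "Modify:" label,
    // or null when the line does not start with that label.
    public static string ExtractModifyTimestamp(string line)
    {
        const string label = "Modify:";
        if (line == null || !line.StartsWith(label, StringComparison.Ordinal))
        {
            return null;
        }
        // Drop the label, then trim the padding around the timestamp.
        return line.Substring(label.Length).Trim();
    }
}
```

Calling StatOutput.ExtractModifyTimestamp on the sample line returns the bare timestamp string, ready for the parsing steps that follow.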

We would normally expect to parse this format using a custom DateTimeOffset format with a ...ParseExact statement.

"yyyy-MM-dd HH:mm:ss.FFFFFFFFF zzz"

At first blush, we expect this to work. A custom format should handle the input we are trying to parse. Note the 9 F's in a row.

But this is not supported in the .NET Framework.

This is because .NET datetime format strings do not accept more than 7 F's in a row; 7 F's is the limit of precision. I was a little shocked at this. Custom formats should handle any combination and permutation thrown at them, but more importantly this is a canonical Linux timestamp! This should have been part of the test suite, guys!

So, here's a way to parse the stat command timestamp, but it will lose a tiny amount of accuracy.

The regex here validates the stat datetime-with-offset format and truncates the last 2 fractional digits, dropping the 10 ns and 1 ns places (anything finer than 100 nanoseconds). Satisfactory in my situation.

Note: The smallest unit of datetime resolution in .NET is a single tick, which represents one hundred nanoseconds or one ten-millionth of a second. There are 10,000 ticks in a millisecond (see TicksPerMillisecond) and 10 million ticks in a second.
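
To make the tick arithmetic concrete, here is a small sketch that converts the 9-digit nanosecond fraction from stat into whole ticks and reports what is left over. The class and method names (StatFractionDemo, NanosecondsToTicks) are hypothetical; only TimeSpan.TicksPerMillisecond and TimeSpan.TicksPerSecond come from .NET itself.

```csharp
using System;

public static class StatFractionDemo
{
    // One tick is 100 nanoseconds, so a nanosecond count loses its last
    // two decimal digits when expressed in whole ticks.
    public const long NanosecondsPerTick = 100;

    // Hypothetical helper: returns the whole ticks in a nanosecond count and
    // reports the nanoseconds that do not fit into a tick.
    public static long NanosecondsToTicks(long nanoseconds, out long lostNanoseconds)
    {
        lostNanoseconds = nanoseconds % NanosecondsPerTick;
        return nanoseconds / NanosecondsPerTick;
    }
}
```

For the sample fraction .658210121, NanosecondsToTicks(658210121, out lost) gives 6582101 ticks with 21 nanoseconds lost, which is exactly the two digits the truncation drops.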

Here's the static code in case the JIT site expires.


using System;
using System.Globalization;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        // The Unix stat timestamp cannot be described by a DateTime or
        // DateTimeOffset custom format string in .NET:
        // FFFFFFFFF (9 F's) is not accepted.
        string dateString = "2010-11-29 17:56:22.000000000 -0800";
        DateTimeOffset unixstatdt = new DateTimeOffset();

        // Pass 1: try the full 9-digit fraction. This is expected to fail.
        if (DateTimeOffset.TryParseExact(dateString, "yyyy-MM-dd HH:mm:ss.FFFFFFFFF zzz", CultureInfo.InvariantCulture, DateTimeStyles.None, out unixstatdt))
        {
            Console.WriteLine("Parsed Unix stat datetime with time zone\n        {0}", unixstatdt.ToString("o"));
        }
        else
        {
            Console.WriteLine("Failed to parse Unix stat datetime");
        }

        // Solution: truncate the fraction from 9 digits down to 7.
        // Crude, but it works.
        // The regex validates the stat format and captures everything except
        // the last two fractional digits. The literal space before the offset
        // group is ignored because of RegexOptions.IgnorePatternWhitespace.
        string LinuxStatCMDpattern = @"(\d{4}-\d{2}-\d{2})\s(\d{2}:\d{2}:\d{2}\.)(\d{7})\d{2}\s ([\+-]\d{4})";
        string substitution = @"$1 $2$3 $4";

        string input = @"2021-03-17 08:53:39.540802643 +0100";
        RegexOptions options = RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace;

        Regex regexstatcmd = new Regex(LinuxStatCMDpattern, options);

        string result = string.Empty;

        // Only rewrite the input when it matches the stat format.
        if (regexstatcmd.IsMatch(input))
            result = regexstatcmd.Replace(input, substitution);

        Console.WriteLine("Trimmed " + result);

        // Pass 2: a 7-digit fraction (FFFFFFF) is the maximum accepted.
        if (DateTimeOffset.TryParseExact(result, "yyyy-MM-dd HH:mm:ss.FFFFFFF zzz", CultureInfo.InvariantCulture, DateTimeStyles.None, out unixstatdt))
        {
            Console.WriteLine("Pass 2: Parsed Unix stat datetime with time zone\n        {0}", unixstatdt.ToString("o"));
        }
        else
        {
            Console.WriteLine("Pass 2: Failed");
        }
    }
}
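
The prototype above can also be folded into a single reusable method. What follows is only a sketch under a few assumptions: the names StatTimestamp and TryParse are hypothetical, and the pattern is an anchored variant of the one above (so text before or after the timestamp is rejected rather than passed through), which is a stricter choice than the original.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class StatTimestamp
{
    // Anchored variant of the stat pattern: date, time, the 7 fraction digits
    // that are kept, the 2 fraction digits that are dropped, then the UTC offset.
    private static readonly Regex Pattern = new Regex(
        @"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}\.)(\d{7})\d{2} ([+-]\d{4})$",
        RegexOptions.CultureInvariant);

    // Hypothetical helper: validates the stat timestamp, truncates the fraction
    // to 7 digits and parses the result. Returns false on any mismatch.
    public static bool TryParse(string input, out DateTimeOffset value)
    {
        value = default(DateTimeOffset);
        if (input == null)
        {
            return false;
        }

        Match m = Pattern.Match(input.Trim());
        if (!m.Success)
        {
            return false;
        }

        // Rebuild the timestamp without the last two fractional digits.
        string trimmed = m.Groups[1].Value + " " + m.Groups[2].Value
            + m.Groups[3].Value + " " + m.Groups[4].Value;

        return DateTimeOffset.TryParseExact(
            trimmed,
            "yyyy-MM-dd HH:mm:ss.FFFFFFF zzz",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out value);
    }
}
```

With the sample input "2021-03-17 08:53:39.540802643 +0100", StatTimestamp.TryParse returns true and value.ToString("o") yields 2021-03-17T08:53:39.5408026+01:00.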

See it in action using my Clipboard PowerTool ( https://clipboardplaintextpowertool.blogspot.com/ ), which allows you to copy any of the dates below and translate it to the current time zone and a readable datetime stamp!



Date Time Format               DFT Date Time Now
UTC                            2023-04-19T16:25:39Z
ISO-8601                       2023-04-19T16:25:39+0000
RFC 2822                       Wed, 19 Apr 2023 16:25:39 +0000
RFC 850                        Wednesday, 19-Apr-23 16:25:39 UTC
RFC 1036                       Wed, 19 Apr 23 16:25:39 +0000
RFC 1123                       Wed, 19 Apr 2023 16:25:39 +0000
RFC 822                        Wed, 19 Apr 23 16:25:39 +0000
RFC 3339                       2023-04-19T16:25:39+00:00
ATOM                           2023-04-19T16:25:39+00:00
COOKIE                         Wednesday, 19-Apr-2023 16:25:39 UTC
RSS                            Wed, 19 Apr 2023 16:25:39 +0000
W3C                            2023-04-19T16:25:39+00:00
UNIX STAT COMMAND TIMESTAMP    2023-04-19 16:25:39.658210121 +0000
YYYY-DD-MM HH:MM:SS            2023-19-04 16:25:39
YYYY-DD-MM HH:MM:SS am/pm      2023-19-04 04:25:39 PM
DD-MM-YYYY HH:MM:SS            19-04-2023 16:25:39
MM-DD-YYYY HH:MM:SS            04-19-2023 16:25:39
