Monday, April 25, 2022

Right-to-Left Override RTLO Removal Tool for filenames #Windows #malware

Malware writers can trick you in two ways into thinking a file is a harmless PDF: with a fake icon, and with the Right-to-Left Override (RTLO) technique.

“Right-to-Left Override” RTLO example

First, a maliciously constructed ".exe" can be built to display a PDF icon, so it looks as if the default PDF reader will open the file. If the filename is really long, you can't see the extension (see image below).

Second, and less obviously, a malicious filename can be constructed with a right-to-left override character in such a way that the name appears to end in a ".pdf" extension, while the file is really an ".exe".


In the example below, the 2nd file looks like a ".txt" file, but is really a ".docx" file (the 1st file). The 1st file has been cleansed of the RTL Unicode character, and visibly ends in ".docx".

The "PDF" file is actually an ".exe" file, but looks as if it will open with the default PDF reader.


Download RTLExample.7z. It includes the above files and the PDF ".exe" example. The files contain no viruses or malware: the "PDF" is a safe ".exe" that just opens this page in Chrome. GDrive still marks these examples with "Sorry, this file is infected with a virus", which is good, because it is detecting the RTL character and the exe. But it is a false positive, since there is no virus in the files. You can create your own examples by inserting the RTL character into a filename; see this video: https://youtu.be/n2kV3Q2eTCY


Here are the same files as viewed from the command line (cmd.exe). The box character represents the RTL character.


Note: Detection of a malicious file is never done by filename alone, so a good antivirus will check the contents of this file for known signatures. But you can remove the annoying RTL character with the free tool below!

How is RTLO being abused by malware writers?

In apps that support Unicode, like Windows Explorer, the right-to-left override method uses an RTL Unicode character (U+202E, RIGHT-TO-LEFT OVERRIDE) that reverses the display order of the characters that follow it. The character is intended mainly for languages that are read right-to-left, such as Arabic and Hebrew.

RTLO can be used to spoof file extensions. To do this, a hidden RTL Unicode character is placed in the filename.
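To make the spoof concrete, here is a minimal C# sketch. The filename is hypothetical, and `DisplayedAs` only approximates what a bidi-aware UI such as Explorer draws (it simply reverses everything after the override character); it is an illustration, not a rendering engine.

```csharp
using System;

class RtloDemo
{
    const char Rlo = '\u202E'; // RIGHT-TO-LEFT OVERRIDE

    // True when the name contains the override character.
    static bool HasRtlo(string fileName) => fileName.IndexOf(Rlo) >= 0;

    // Rough approximation of the displayed name: text after the override is drawn reversed.
    static string DisplayedAs(string fileName)
    {
        int i = fileName.IndexOf(Rlo);
        if (i < 0) return fileName;
        char[] tail = fileName.Substring(i + 1).ToCharArray();
        Array.Reverse(tail);
        return fileName.Substring(0, i) + new string(tail);
    }

    static void Main()
    {
        string spoofed = "invoice" + Rlo + "fdp.exe"; // hypothetical filename

        Console.WriteLine(HasRtlo(spoofed));      // True
        Console.WriteLine(DisplayedAs(spoofed));  // invoiceexe.pdf
        // The real extension is still whatever the string ends with.
        Console.WriteLine(spoofed.EndsWith(".exe", StringComparison.OrdinalIgnoreCase)); // True
    }
}
```

The point of the sketch is that the stored name still ends in ".exe"; only the on-screen order changes.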

What is “Right-to-Left Override” RTLO?
The RTLO method is used to hide the true type of a file, so it might trick you into opening what looks like a text file (.txt) but is really a Word file (.docx) carrying malware. More recently the disguised file could be a .wav file: audio files such as .wav files are being embedded with malware, which is on the forefront of malware techniques. Read about that on my post here.
The method exploits a feature built into Windows Explorer. Microsoft Windows does a great job of supporting different languages from around the world, and some of those languages are written from right to left (RTL).
Let’s say you want to use a right-to-left written language, like Hebrew or Arabic, on a site combined with a left-to-right written language like English or French. In this case, you would want bidirectional script support.
Bidirectional script support is the capability of a computer system to correctly display bi-directional text. In HTML we can use Unicode right-to-left marks and left-to-right marks to override the HTML bidirectional algorithm when it produces undesirable results:
left-to-right mark: ‎ (U+200E) Unicode character
right-to-left mark: ‏ (U+200F) Unicode character
How do you fix files that have the RTLO or other bad characters ? 

Here's a tool I built to clean up the Right-to-Left Mark (and many others) and Unicode control characters from your filenames. It's super fast, small, and written in native C++.
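The tool's source is not shown in this post. As a rough idea of the kind of cleanup involved, here is a minimal C# sketch that drops characters in the Unicode Format (Cf) and Control (Cc) categories from a name. The method name and the sample filename are made up for illustration, and the real tool may use a different character set.

```csharp
using System;
using System.Globalization;
using System.Text;

class CleanNameSketch
{
    // Drops every UTF-16 code unit whose Unicode category is Format (Cf) or Control (Cc).
    // Characters outside the BMP are stored as surrogate pairs and are left untouched here.
    static string StripFormatAndControl(string name)
    {
        var cleaned = new StringBuilder(name.Length);
        foreach (char c in name)
        {
            UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(c);
            if (category != UnicodeCategory.Format && category != UnicodeCategory.Control)
                cleaned.Append(c);
        }
        return cleaned.ToString();
    }

    static void Main()
    {
        string dirty = "notes\u202Etxt.docx"; // hypothetical name containing U+202E
        string clean = StripFormatAndControl(dirty);

        Console.WriteLine(clean);                        // notestxt.docx
        Console.WriteLine(dirty.Length - clean.Length);  // 1
    }
}
```

U+202E, U+200E and U+200F all fall in the Format category, so one category test covers them without listing each code point.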

Updated Thu 21-Apr-22 - new build, fixed many recursive issues




Download
 touchRTL.7z (you need https://www.7-zip.org/ to unpack). For personal use only. Copy into c:\windows to use from cmd.exe.


License : 

touchRTL.7z is for personal use only; for commercial use, buy touchRTLPRO. The contact address (validated as of today) is available for license requests.

touchRTLPRO.7z has flags to remove Unicode spaces and punctuation (math symbols, currency, opening and closing braces, and accent marks).

Just run this command and it will recursively rename files under the specified directory to remove those characters. If the directory name contains spaces, you need quotes.


touchRTL -v -R -l -y "directory name"

where


Usage: touchRTL [-aclmpRuvxy] [-r REFFILE | -d DATETIME] PATH...

UNIX touch mimic, updates files access, modification and creation times of file(s) in PATH to the current time,
If PATH argument does not exist, creates corresponding new empty file or directory (using -y), unless -c or --n
Supports directory recursion and time stamping!
Supports Right-to-Left (RTL) character removal for files.
PATH argument can represent a filename(s) or directory. Double quote if it contains spaces. eg "c:\as is.txt"

  -a, --access-time        change only the file access time
  -c, --no-create          do not create any new files - If the file exists, touch will update the access time,
  -l, --RTL                remove Unicode control & format characters (esp. infamous right-to-left) from filena
  -m, --modif-time         change only the file modification time
  -p, --pause-exit         pause on exit (non-GNU extra)
  -R, --recursive          recursively touch files in specified directory and all subdirectories (non-GNU extra
  -u, --unicntrl           remove Unicode control characters only - https://www.fileformat.info/info/unicode/ca
  -v, --verbose            output the result of every file processed (non-GNU extra)
  -x, --creation-time      change only the file creation time (non-GNU extra)
  -y, --directory          specify directory, instead of default file
  -r, --reference REFFILE  use this file's times instead of current time
  -s, --spaces (PRO edtn)  remove Unicode spaces from filename
  -!, --puncs  (PRO edtn)  remove Unicode punctuations & symbols (math & modifiers) from filename

  -d, --date DATETIME      use YYYY-MM-DDThh:mm:ss[.ms] instead of current time (non-GNU, does not parse string
                           accepted "2033-04-01T07:07:07", "2033-04-01 07:07:07"

  -h, --help               Display this help and exit.

      --version            Display version information and license information.

For personal use only. Commercial license required for business use and removes page open. See --version for al
Copyright © 2019-2022 M. Pahulje <metadataconsult@gmail.com> - https://http://metadataconsulting.blogspot.com/
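
For readers curious what a recursive rename like `touchRTL -R -l` involves, here is a hedged C# sketch. It is not the tool's implementation: the class name, the skip-on-collision policy and the choice to rename files only (not directories) are all assumptions made for this example. Try it on a scratch directory first.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

class RenameSketch
{
    // Removes Unicode Format (Cf) and Control (Cc) characters from a name.
    static string Strip(string name) =>
        new string(name.Where(c =>
        {
            UnicodeCategory cat = CharUnicodeInfo.GetUnicodeCategory(c);
            return cat != UnicodeCategory.Format && cat != UnicodeCategory.Control;
        }).ToArray());

    static void Main(string[] args)
    {
        if (args.Length != 1 || !Directory.Exists(args[0]))
        {
            Console.Error.WriteLine("usage: RenameSketch <directory>");
            return;
        }

        // GetFiles returns the full list up front, so renaming does not disturb the walk.
        foreach (string path in Directory.GetFiles(args[0], "*", SearchOption.AllDirectories))
        {
            string oldName = Path.GetFileName(path);
            string newName = Strip(oldName);
            if (newName == oldName || newName.Length == 0) continue;

            string target = Path.Combine(Path.GetDirectoryName(path), newName);
            if (File.Exists(target))
            {
                // Assumed policy: never overwrite an existing file.
                Console.WriteLine("skipped (target exists): " + target);
                continue;
            }

            File.Move(path, target);
            Console.WriteLine("renamed to: " + target);
        }
    }
}
```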

Thursday, March 31, 2022

Metamask crypto wallet phishing email with subject Metamask Withdrawal suspended

For the record, this is a Metamask crypto wallet phishing email attempt that has recently been going around, with the subject "Metamask Withdrawal suspended".

What to do? Report it; go to the bottom of this page.


From: Metamask.io <no-reply@livestormevents.com>
Subject: Metamask Withdrawal suspended






PHISHING LINKS:

1. https://electri-tech.co.za/.well-known/

How to tell this is a Phishing email ?

  1. Check the sender's email address in full; if it's not from the originating company's domain, it's phishing.
  2. Hover over all links in the email; if they don't point to the company's website, forget it.
  3. The best way is to 

How to examine Email Message Source ?

Now let's look at the message source.
  1. Outlook.com->Actions->View Message Source.
  2. Gmail.com->More (down arrow at top right)->Show original.
Check for suspicious links: anything that does not originate from metamask.io.
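
The sender and link checks above can be sketched in C# as follows. This is only an illustration: the expected domain (metamask.io, the brand this email claims to come from), the helper names and the one-line stand-in for the message source are assumptions; the sender address and link are the ones recorded in this post.

```csharp
using System;
using System.Text.RegularExpressions;

class PhishCheckSketch
{
    // True when host is the expected domain or one of its subdomains.
    static bool BelongsTo(string host, string expectedDomain) =>
        host.Equals(expectedDomain, StringComparison.OrdinalIgnoreCase) ||
        host.EndsWith("." + expectedDomain, StringComparison.OrdinalIgnoreCase);

    // Takes everything after the last '@' of a From header and trims the closing bracket.
    static string SenderDomain(string fromHeader)
    {
        int at = fromHeader.LastIndexOf('@');
        return at < 0 ? "" : fromHeader.Substring(at + 1).TrimEnd('>', ' ');
    }

    static void Main()
    {
        string expected = "metamask.io"; // assumed legitimate domain
        string from = "Metamask.io <no-reply@livestormevents.com>";
        // Stand-in for the real message source.
        string source = "<a href=\"https://electri-tech.co.za/.well-known/\">Verify</a>";

        string sender = SenderDomain(from);
        Console.WriteLine(sender + ": " + (BelongsTo(sender, expected) ? "ok" : "SUSPICIOUS"));

        // Pull out every http(s) URL and compare its host with the expected domain.
        foreach (Match m in Regex.Matches(source, @"https?://[^\s""'<>]+"))
        {
            if (Uri.TryCreate(m.Value, UriKind.Absolute, out var uri))
                Console.WriteLine(uri.Host + ": " + (BelongsTo(uri.Host, expected) ? "ok" : "SUSPICIOUS"));
        }
    }
}
```

Comparing whole host names (or a suffix starting with a dot) avoids being fooled by look-alikes such as a host that merely contains the brand name.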


Report Phishing Email (not as Spam)

  1. Outlook.com->Junk (at Top)->Phishing Scam
  2. Gmail.com->More (down-arrow to top right)->Report Phishing 

Report Phishing

If you have received this email, take it further at

  1. https://www.google.com/safebrowsing/report_phish/


Report phishing at Microsoft and subsequently government agencies

  1. http://www.microsoft.com/security/online-privacy/phishing-faq.aspx
  2. Report Phishing Sites | CISA
  3. Home - Canada's Anti-Spam Legislation (fightspam.gc.ca)

Tuesday, March 22, 2022

C# Regular Expressions - Getting Substitutions Groupings Index and Length .NET Framework limitation workaround improvement

Reflecting on .NET's 20th anniversary, there are a few shortcomings in the .NET Framework that are still glaring examples of tunnel vision. First is the lack of growth in, or concern for, expanding the BCL toward greater coverage of the Win32 APIs. This is the main reason why C++ is still around: not all of the Win32 APIs are implemented in .NET C#. Device driver developers are forced to use C++.

Second, and the focus of this post, is the lack of a correct implementation, and the stagnant vision and foresight, with regard to dealing with regular expression substitutions in a programmatic way since the inception of the C# language.

For example, we are familiar with matching groups, which provide good insight into what each group matched.

match.Groups[1].Value;
match.Groups[1].Index;
match.Groups[1].Length;
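
For reference, a minimal runnable version of the group lookup above; the input string and pattern are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

class GroupsDemo
{
    static void Main()
    {
        // Hypothetical input and pattern with one capturing group.
        Match match = Regex.Match("Hello Lorem world", @"Hello (\w+)");
        if (match.Success)
        {
            Console.WriteLine(match.Groups[1].Value);  // Lorem
            Console.WriteLine(match.Groups[1].Index);  // 6
            Console.WriteLine(match.Groups[1].Length); // 5
        }
    }
}
```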

But wouldn't it be nice to have a similar mechanism to handle substitution groupings in replacement strings?

match.Replacements[1].Value = "$1";
match.Replacements[1].Index = 4;
match.Replacements[1].Length = 2; 
etc...

In this way, when I want to use Regex.Replace to replace a match with the substitution string

@"Lorem$1AAA\$1BBB$1Lorem";

It would be nice to have a data structure that enumerates all the substitution groups ['$1'] in the substitution string. One can code this by hand, but it requires some tedious IndexOf work.
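
A hand-rolled version of that enumeration might look like the sketch below. It is deliberately simple: it only finds numbered tokens such as $1, and it does not handle named groups (${name}) or the $$ literal-dollar substitution. The PrecededByBackslash field is a name made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

class SubstitutionTokens
{
    static void Main()
    {
        string replacement = @"Lorem$1AAA\$1BBB$1Lorem";

        // A '$' followed by digits; this also finds a token that has a backslash in front of it.
        foreach (Match m in Regex.Matches(replacement, @"\$\d+"))
        {
            bool precededByBackslash = m.Index > 0 && replacement[m.Index - 1] == '\\';
            Console.WriteLine(
                $"Value={m.Value} Index={m.Index} Length={m.Length} PrecededByBackslash={precededByBackslash}");
        }
        // Value=$1 Index=5 Length=2 PrecededByBackslash=False
        // Value=$1 Index=11 Length=2 PrecededByBackslash=True
        // Value=$1 Index=16 Length=2 PrecededByBackslash=False
    }
}
```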








Here's the resultant string; this bug has been around for 20 years.

.NET 6 result string (see live demo at dotnetfiddle.net):
Lorem Lorem AAA \Lorem BBB Lorem Lorem
Error/Bug: '\' is in the result, but it is used to escape $1 and should not be part of the result.
Workaround: use matches and filter the match == "\$1" from your results.

Regex101 .NET result string (see live demo at Regex101.com):
Lorem Lorem AAA Lorem BBB Lorem Lorem

Perl's implementation is the gold standard.
Regex101 Perl correct result string (see live demo at Regex101.com):
Lorem Lorem AAA $1 BBB Lorem Lorem
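
The .NET behaviour described here can be reproduced with a few lines. The input and pattern are made up for the example. The second call is included for comparison: .NET's replacement syntax documents $$ as the way to emit a literal dollar sign, which gives a Perl-like result without the backslash.

```csharp
using System;
using System.Text.RegularExpressions;

class BackslashDollarDemo
{
    static void Main()
    {
        // In a .NET replacement pattern the backslash is an ordinary character,
        // so $1 is still expanded and the backslash stays in the output.
        Console.WriteLine(Regex.Replace("Lorem", "(Lorem)", @"AAA\$1BBB")); // AAA\LoremBBB

        // "$$" produces a single literal '$'; the following "1" is then plain text.
        Console.WriteLine(Regex.Replace("Lorem", "(Lorem)", "AAA$$1BBB"));  // AAA$1BBB
    }
}
```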

I'm suggesting a new option as well. 

RegexOptions.PERL

But the pièce de résistance would be to have the following structure populated with the replacement values of the substitution groups in the final resultant string!

replacement.Groups[1].Captures[0].Value = "Lorem";
replacement.Groups[1].Captures[0].Index = 6;
replacement.Groups[1].Captures[0].Length = 5;
replacement.Groups[1].Captures[1].Value = "Lorem";
replacement.Groups[1].Captures[1].Index = 132;
replacement.Groups[1].Captures[1].Length = 5;
etc...

where
Groups[1] is $1
Captures[n] is the n-th occurrence of the $1 replacement (the string "Lorem") in the output string!

Hey .NET Language team, above are my suggestions for a totally new language design component!


Workaround : You can get the indices of substitution group replacements this way,
             but it's a coding adventure.

Using the match groups construct, you can pre-create an output string and replace
everything that does not belong to the current group with dummy characters.
Then you can find the n-th occurrence of the group substitution
[$1, which evaluates to "Lorem"] in the dummy string to get the
correct indices and lengths.
So the resultant string

Lorem Lorem AAA \Lorem BBB Lorem Lorem

becomes

XXXXXXLoremXXXXXXLoremXXXXXLoremXXXXXX

then you can find the correct locations of 'Lorem' ($1) in the resultant string. 

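Trivially, for unnamed groups you can use this:

// Requires: using System.Linq; using System.Text.RegularExpressions;
// s is the substitution string, e.g. @"Lorem$1AAA\$1BBB$1Lorem".
// Every $N token is overwritten with filler characters of the same length.
string fillunamedgroups = Regex.Replace(
    s, @"\$\d+",
    match => string.Concat(Enumerable.Repeat("⌀", match.Length)));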
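
As an alternative to the dummy-character approach, the sketch below swaps in a different technique: it performs the replacement by hand and records where each group value lands in the output. It is not how Regex.Replace works internally. It only understands numbered tokens ($1, $2, ...), treats everything else in the template as literal text, and uses a made-up input and pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

class ReplacementSpans
{
    // Expands the template for every match and records (group, index, length)
    // of each substituted group value in the output string.
    static string ReplaceAndTrack(Regex regex, string input, string template,
                                  List<(int Group, int Index, int Length)> spans)
    {
        MatchCollection tokens = Regex.Matches(template, @"\$(\d+)");
        var output = new StringBuilder();
        int inputPos = 0;

        foreach (Match m in regex.Matches(input))
        {
            output.Append(input, inputPos, m.Index - inputPos); // text between matches
            int templatePos = 0;
            foreach (Match t in tokens)
            {
                output.Append(template, templatePos, t.Index - templatePos); // literal part
                int group = int.Parse(t.Groups[1].Value);
                string value = m.Groups[group].Value;
                spans.Add((group, output.Length, value.Length));
                output.Append(value);
                templatePos = t.Index + t.Length;
            }
            output.Append(template, templatePos, template.Length - templatePos);
            inputPos = m.Index + m.Length;
        }
        output.Append(input, inputPos, input.Length - inputPos); // tail after the last match
        return output.ToString();
    }

    static void Main()
    {
        var regex = new Regex(@"(\w)=(\d)");   // hypothetical pattern
        string input = "x=1, y=2";             // hypothetical input
        string template = "$2:$1";

        var spans = new List<(int Group, int Index, int Length)>();
        string result = ReplaceAndTrack(regex, input, template, spans);

        Console.WriteLine(result); // 1:x, 2:y
        // Sanity check against the built-in replacement for this simple template.
        Console.WriteLine(result == regex.Replace(input, template));
        foreach (var s in spans)
            Console.WriteLine($"group {s.Group}: index {s.Index}, length {s.Length}");
    }
}
```

Because the output is built piece by piece, the index of each group value is simply the length of the output at the moment the value is appended.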


See it in action with my .NET Regular Expression Test Tool 
soon to be available in Clipboard Plaintext PowerTool.