Wednesday, August 28, 2019

C-Sharp A Faster Unicode ReplaceAt method that works with surrogate pairs and 4-byte Unicode characters

Most "ReplaceAt" methods commonly seen online fail when replacing a character at a specific position in a Unicode string.

Unicode String Replace At Issue

Let's examine the Unicode string "🎶🔥é-"

🎶 Unicode Character 'MULTIPLE MUSICAL NOTES' (U+1F3B6) - 4-byte Unicode character
🔥 Fire Emoji U+1F525 - 4-byte Unicode character
é  Latin Small Letter e with Acute U+00E9 - 2-byte Unicode character
-  Unicode Character 'HYPHEN-MINUS' (U+002D) - 2-byte Unicode character

😊 Smiling Face with Smiling Eyes Emoji U+1F60A - 4-byte Unicode character (replacement)

🎶🔥é- has a Length of 6, but there are ONLY 4 characters! Why not len=4?
🎶 and 🔥 are surrogate-pair Unicode characters (code points above U+FFFF), each with a width (len) of 2 UTF-16 code units.
Replacing the last character '-' (position 4) in 🎶🔥é- with a substitute using the most common techniques seen online therefore hits the wrong position.

This is because Unicode code points outside of the Basic Multilingual Plane (BMP), that is, above U+FFFF, are represented in UTF-16 using 4-byte surrogate pairs rather than 2 bytes.


Specifically, the High Surrogate (U+D800–U+DBFF) and Low Surrogate (U+DC00–U+DFFF) codes are reserved for encoding non-BMP characters in UTF-16 by using a pair of 16-bit codes: one High Surrogate and one Low Surrogate. A single surrogate code point will never be assigned a character.
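The surrogate arithmetic described above can be sketched as follows. This is only an illustration: ToSurrogatePair is a hypothetical helper name, and in practice the built-in char.ConvertFromUtf32 does the same job.

```csharp
using System;

public static class SurrogateSketch
{
    // Hypothetical helper: splits a non-BMP code point into its
    // UTF-16 high and low surrogate code units.
    public static char[] ToSurrogatePair(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException("codePoint");

        int v = codePoint - 0x10000;              // 20-bit value
        char high = (char)(0xD800 + (v >> 10));   // top 10 bits
        char low = (char)(0xDC00 + (v & 0x3FF));  // bottom 10 bits
        return new[] { high, low };
    }

    public static void Main()
    {
        char[] pair = ToSurrogatePair(0x1F3B6); // MULTIPLE MUSICAL NOTES
        Console.WriteLine("{0:X4} {1:X4}", (int)pair[0], (int)pair[1]); // D83C DFB6
        // The built-in conversion produces the same two code units.
        Console.WriteLine(new string(pair) == char.ConvertFromUtf32(0x1F3B6)); // True
    }
}
```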

To correctly count the number of characters in a string that may contain code points higher than U+FFFF, you can use the StringInfo class (from System.Globalization).
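A minimal sketch of that counting difference, using the example string from this post written with escape sequences (CountTextElements is a hypothetical wrapper name):

```csharp
using System;
using System.Globalization;

public static class CountSketch
{
    // Hypothetical wrapper around StringInfo.LengthInTextElements.
    public static int CountTextElements(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }

    public static void Main()
    {
        // MULTIPLE MUSICAL NOTES, FIRE, e with acute, hyphen-minus
        string s = "\U0001F3B6\U0001F525\u00E9-";
        Console.WriteLine(s.Length);             // 6 UTF-16 code units
        Console.WriteLine(CountTextElements(s)); // 4 text elements
    }
}
```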

Below is a large enumeration of common ReplaceAt implementations available on the internet. They all fail, except for the one that uses StringInfo.

I have optimized this method into UnicodeReplaceAtFastest, the fastest implementation so far; it beats the old UnicodeReplaceAt by 2 ms (on average).
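As a rough illustration of the StringInfo-based idea, here is a minimal sketch. It is not the author's UnicodeReplaceAtFastest; the name ReplaceTextElementAt and its signature are assumptions made for this example. It uses StringInfo.ParseCombiningCharacters to find where each text element starts, then splices the replacement in with a StringBuilder.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class UnicodeReplaceAtSketch
{
    // Hypothetical helper: replaces the text element (not the UTF-16
    // code unit) at the given zero-based index.
    public static string ReplaceTextElementAt(string s, int index, string replacement)
    {
        if (s == null) throw new ArgumentNullException("s");
        if (replacement == null) throw new ArgumentNullException("replacement");

        // Start offset, in code units, of every text element in s.
        int[] starts = StringInfo.ParseCombiningCharacters(s);
        if (index < 0 || index >= starts.Length)
            throw new ArgumentOutOfRangeException("index");

        int start = starts[index];
        int end = index + 1 < starts.Length ? starts[index + 1] : s.Length;

        return new StringBuilder(s.Length + replacement.Length)
            .Append(s, 0, start)            // everything before the element
            .Append(replacement)            // the new element
            .Append(s, end, s.Length - end) // everything after it
            .ToString();
    }

    public static void Main()
    {
        string s = "\U0001F3B6\U0001F525\u00E9-";
        // Replace the 4th character '-' (text element index 3) with U+1F60A.
        string r = ReplaceTextElementAt(s, 3, "\U0001F60A");
        Console.WriteLine(r.Length); // 7 code units: 2 + 2 + 1 + 2
    }
}
```

By contrast, s.Remove(3, 1).Insert(3, ...) on the same string would cut into the second half of the fire emoji's surrogate pair, because index 3 is a code unit offset there.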


Tuesday, August 27, 2019

C-Sharp How can I read raw (CF_HTML) clipboard data


Here's how to read raw HTML from the clipboard using P/Invoke Win32 native methods, specifying the clipboard data type CF_HTML.


Avoid Clipboard.GetText(TextDataFormat.Html) and use the P/Invoke code below instead, especially with .NET Framework 4.0 or earlier, because GetText introduces garbled characters.

See my blogpost 
How to get HTML from the Windows system clipboard directly using PInvoke Win32 Native methods avoiding bad funny characters
using System;
using System.Runtime.InteropServices;
using System.Text;

//--------------------------------------------------------------------------------
// http://metadataconsulting.blogspot.com/2019/06/How-to-get-HTML-from-the-Windows-system-clipboard-directly-using-PInvoke-Win32-Native-methods-avoiding-bad-funny-characters.html
//--------------------------------------------------------------------------------

public class ClipboardHelper
{
 #region Win32 Native PInvoke
 
 [DllImport("User32.dll", SetLastError = true)]
 private static extern uint RegisterClipboardFormat(string lpszFormat);
 //or specifically - private static extern uint RegisterClipboardFormatA(string lpszFormat);

 [DllImport("User32.dll", SetLastError = true)]
 [return: MarshalAs(UnmanagedType.Bool)]
 private static extern bool IsClipboardFormatAvailable(uint format);

 [DllImport("User32.dll", SetLastError = true)]
 private static extern IntPtr GetClipboardData(uint uFormat);

 [DllImport("User32.dll", SetLastError = true)]
 [return: MarshalAs(UnmanagedType.Bool)]
 private static extern bool OpenClipboard(IntPtr hWndNewOwner);

 [DllImport("User32.dll", SetLastError = true)]
 [return: MarshalAs(UnmanagedType.Bool)]
 private static extern bool CloseClipboard();

 [DllImport("Kernel32.dll", SetLastError = true)]
 private static extern IntPtr GlobalLock(IntPtr hMem);

 [DllImport("Kernel32.dll", SetLastError = true)]
 [return: MarshalAs(UnmanagedType.Bool)]
 private static extern bool GlobalUnlock(IntPtr hMem);

 [DllImport("Kernel32.dll", SetLastError = true)]
 private static extern int GlobalSize(IntPtr hMem);
 
 #endregion

 public static string GetHTMLWin32Native()
 {

  string strHTMLUTF8 = string.Empty; 
  // Call the method as declared above; it returns 0 on failure.
  uint CF_HTML = RegisterClipboardFormat("HTML Format");
  if (CF_HTML == 0)
    return null;

  if (!IsClipboardFormatAvailable(CF_HTML))
   return null;

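  // Open the clipboard before entering the try block, so that
  // CloseClipboard() in the finally block only runs after a successful open.
  if (!OpenClipboard(IntPtr.Zero))
   return null;

  try
  {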

   IntPtr handle = GetClipboardData(CF_HTML);
   if (handle == IntPtr.Zero)
    return null;

   IntPtr pointer = IntPtr.Zero;

   try
   {
    pointer = GlobalLock(handle);
    if (pointer == IntPtr.Zero)
     return null;

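    // GlobalSize is declared above as returning int, so keep the size as int.
    int size = GlobalSize(handle);
    byte[] buff = new byte[size];

    Marshal.Copy(pointer, buff, 0, size);

    // CF_HTML clipboard data is UTF-8 encoded.
    strHTMLUTF8 = System.Text.Encoding.UTF8.GetString(buff);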
   }
   finally
   {
    if (pointer != IntPtr.Zero)
     GlobalUnlock(handle);
   }
  }
  finally
  {
   CloseClipboard();
  }

  return strHTMLUTF8; 
 }
}


Tuesday, August 20, 2019

CBC Stream URLs ended/stopped working/unavailable as of Aug 2019


CBC Stream Page no longer exists http://www.cbc.ca/radio/includes/stream.html 

CBC Audience Services (CBC) August 20, 2019

Thanks for reaching out to CBC.

CBC is no longer offering direct URLs for streaming as of late. Sorry to disappoint.

Moving forward CBC Streaming URLs (http://www.cbc.ca/radio/includes/stream.html) will only be provided by a third party like Google Home, Grace, Sirius or TuneIn. If you are using these third party devices, know that we have had issues with the stream over the last week and are currently working on repairing them with the aforementioned providers.


In the meantime, you can listen live to CBC Radio One and CBC Music on our web based streaming here: cbc.ca/listen or on the LISTEN Beta Testing App.

All the best,

Jamie
CBC Audience Relations

Wednesday, August 14, 2019

Amazon Phishing Email - Subject - Statement of your payment updated on

For the record, this is an Amazon phishing email that has recently been going around and made it through spam filters. What to do? Report it; see the bottom of this post.


From : "account-alert@amazon.com"
 
Subject : Re: RE : [ Daily Reminder ] [ Daily August Report ] - Statement of your payment updated on "Sunday - xxx...".  xxx Summary of News Information Report has been sent to..."Newsletter"



Here's a preview of the email in text



Dear Customer, 
 
 
It looks like we lost some information for your account. To comply with applicable laws, Amazon needs to collect certain information from you to help make your Amazon account as safe as possible. 
 
 
Just log in amazon.com and follow the instructions in your account notifications to see what information you need to provide. Please send the missing information by 15 August 2019.  
 
 
Sincerely,  
 
 
Amazon.com  
========================= 

SPAM/PHISHING LINKS:

1. https://amazon-services-center.customerservice-help10.com/?dashboardaccount

How to tell this is a Phishing email ?

  1. Check the sender's email address in full; if it's not from the originating company, then it's phishing.
  2. Hover over all links in the email; if they don't point to the amazon.com site, then forget it.
  3. The best way is to look at the message source, see below.

How to examine Email Message Source ?

Now let's look at the message source
  1. Outlook.com->Actions->View Message Source.
  2. Gmail.com->More (down arrow to top right)->Show original.
Check for suspicious links: anything that does not originate from amazon.com.
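The link check above can also be done in code. The following is only a sketch under assumptions: IsHostOnDomain is a hypothetical helper, and a host match alone does not prove a link is safe; it merely flags links whose host is clearly not the expected domain.

```csharp
using System;

public static class LinkCheckSketch
{
    // Hypothetical helper: true only when the URL's host is the expected
    // domain itself or one of its subdomains.
    public static bool IsHostOnDomain(string url, string expectedDomain)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri))
            return false;

        string host = uri.Host;
        return host.Equals(expectedDomain, StringComparison.OrdinalIgnoreCase)
            || host.EndsWith("." + expectedDomain, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // The phishing link from this email: "amazon" appears only as part
        // of a subdomain label of customerservice-help10.com.
        Console.WriteLine(IsHostOnDomain(
            "https://amazon-services-center.customerservice-help10.com/?dashboardaccount",
            "amazon.com")); // False
        Console.WriteLine(IsHostOnDomain("https://www.amazon.com/", "amazon.com")); // True
    }
}
```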


Report Phishing Email (not as Spam)

  1. Outlook.com->Junk (at Top)->Phishing Scam
  2. Gmail.com->More (down arrow to top right)->Report Phishing

Report Phishing URLs at Google now 

If you have received this email, take further action now by clicking these links

  1. https://www.google.com/safebrowsing/report_phish/


Report phishing at Microsoft and government agencies

  1. http://www.microsoft.com/security/online-privacy/phishing-faq.aspx

Report phishing emails to Amazon 

Send the e-mail to stop-spoofing@amazon.com
Note: Sending this suspicious e-mail as an attachment is the best way for us to track it.

Phishing Email - @ Walmart job selection's solution

For the record, here's another SPAM email that made it through filters.

From : ms.manager1@aol.com
Subject : @ job selection's solution



Dear Ma/Sir;
----------------------
 We have a customer service survey assignment in your location and we will pay $450 / assignment.
 Which would come in the form of a cashiers check for you to perform your assignment.
--------------------------------------------------------------------------
 The job entails an Evaluation process such as visiting Wal-mart/K-mart,e.t.c
 Send below information to get started If you are still Interested Applicants are to forward

-----------------------------
 > * name (first/last):
 > * address:
 > * city, state, & zip code:
 > * age, gender:
 > * phone:
 > * e-mail:

 Thank You for Your participation and being here with Us.

 Sincerely,
 The MS applications team
 (C) 2019 SR & I. All rights reserved.

Tuesday, August 13, 2019

Phishing Email - Hi,am Glroia,you have won 3.5 M USD. Send YES for claims.

For the record, a phishing email that made it past SPAM filters.

Email is from Correo Aurrerantz S.Coop. Coordinacion 
which seems legitimate.












However, if you hit reply, you will notice that the reply address is different and is a spam email address:
gmkgmk24998@hotmail.com

Monday, August 12, 2019

C# Winforms Rendering Differences on Windows 7 vs Win10 with Scale Set at 150%

As an active Windows developer of a number of WinForm applications, I always have to stay backwards compatible with Windows 7, and hence target the .NET 4.0 Framework, which runs on Windows 7.

But recently, I noticed that my WinForm app was not rendering properly in Windows 10, but was working just fine on Windows 7? What gives, same framework right? 

Here's an example of a problematic DialogBox not rendering well in Windows 10




Here's an example of the corrected DialogBox rendering properly in Windows 10




Problem Description


I have a WinForm program that launches another WinForm Text Editor using the standard
System.Diagnostics.Process.Start("texteditor.exe");
call, but in the Text Editor a number of dialog boxes would get squeezed.

But I noticed that if I double-click on TextEditor.exe directly, the dialog boxes render properly!

So I then tried running TextEditor.exe from the command line


Process process = new Process
{
    StartInfo =
    {
        UseShellExecute = false,
        RedirectStandardOutput = true,
        RedirectStandardError = true,
        CreateNoWindow = true,
        FileName = "cmd.exe",
        Arguments = "/C texteditor.exe"
    }
};
process.Start();
process.WaitForExit();

Still, I got the same result: a crunched dialog box.


Solution to Rendering WinForm Apps on Windows 10 with Scaling Factor set to something other than 100%


The following worked to launch another exe with proper rendering of all dialog boxes in Windows 10.


Process process = new Process
{
    StartInfo =
    {
        UseShellExecute = false,
        RedirectStandardOutput = true,
        RedirectStandardError = true,
        CreateNoWindow = true,
        FileName = "Explorer.exe",
        Arguments = "texteditor.exe"
    }
};
process.Start();
process.WaitForExit();



My Windows 10 Display Settings have Scale and Layout set to 150%, which caused the rendering issue. If the scale is set to 100%, the dialog box renders correctly, but at other scales the dialog box is crunched.







Thursday, August 8, 2019

Building FRHED Free Hex Editor for Windows using VS 2010 Fixed

Building FRHED using Visual Studio 2010 on Windows




Frhed is a free open-source binary file editor for Windows. It's quite old (last modified in 2009) but works great.

Download FRHED C++ source files here
https://sourceforge.net/projects/frhed/files/



But I got the following error

LINK : fatal error LNK1181: cannot open input file '..Debug"\lang.res'

Here's the fix - PreLink.bat

This corrects build errors for versions 1.6.0 (stable) and 1.7.1 (alpha) releases.


rem Add metadataconsulting.blogspot.com Thu 08-Aug-19 5:08pm Markus 
rem Fix LINK : fatal error LNK1181: cannot open input file '..Debug"\lang.res'
rem add vcvars32.bat for link.exe to set path properly
call "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\vcvars32.bat"
cd
echo %0
echo $(IntDir) = %1
echo $(TargetPath) = %2
set IntDir = $(IntDir); 
set IntDirnoLastDoubleQuote=%IntDir:~1,-1%
set IntDirnoLastDoubleQuoteBackslash=IntDirnoLastDoubleQuote\
echo $(IntDirnoLastDoubleQuoteBackslash) = %3

cd ..\Translations\Frhed
cscript CreateMasterPotFile.vbs
cscript UpdatePoFilesFromPotFile.vbs
cd ..\..\FRHED

rem rc /v /fo%1lang.res /i.. ..\Translations\Frhed\heksedit.rc
rem rc /v /fo ".\..\BuildTmp\heksedit\Debug\lang.res" /i.. ..\Translations\Frhed\heksedit.rc

rc /v /fo %3lang.res /i.. ..\Translations\Frhed\heksedit.rc

rem copy %1lang.res %2
mkdir %2\..\Languages

rem "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\link.exe" if needed
link /DLL /NOENTRY /MACHINE:IX86 /OUT:%2\..\Languages\heksedit.lng %3lang.res
copy ..\Translations\Frhed\*.po %2\..\Languages

Tuesday, August 6, 2019

Remove all Unicode Control Characters and special control characters fast in C-Sharp
















Here's how to remove all Unicode control characters from a string fast in C-Sharp, strictly speaking. 

However, you may notionally consider Unicode Character 'RIGHT-TO-LEFT OVERRIDE' (U+202E) a control character, even though it officially sits in the General Punctuation block as a format character, so Char.IsControl returns false for it.

In the code example below, the isSpecialUnicodeCntrlChr() method removes right-to-left and left-to-right characters.

Take a look at these Unicode characters and decide what you consider a control character
https://unicode-table.com/en/blocks/general-punctuation/
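To see why these characters need separate handling, here is a small sketch that prints how .NET classifies a character (Describe is a hypothetical helper name). Char.IsControl is true only for the Unicode Control category, while U+202E falls in the Format category.

```csharp
using System;
using System.Globalization;

public static class ControlCategorySketch
{
    // Hypothetical helper: reports how .NET classifies a character.
    public static string Describe(char c)
    {
        return string.Format("U+{0:X4} IsControl={1} Category={2}",
            (int)c, Char.IsControl(c), Char.GetUnicodeCategory(c));
    }

    public static void Main()
    {
        Console.WriteLine(Describe('\b'));     // U+0008 IsControl=True Category=Control
        Console.WriteLine(Describe('\u202E')); // U+202E IsControl=False Category=Format
    }
}
```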

using System;using System.Text;using System.Diagnostics;using System.Collections.Generic;
					
public static class Program
{
	//https://unicode-table.com/en/blocks/general-punctuation/ add some special cases
	public static bool isSpecialUnicodeCntrlChr(this Char c)
    {
		//remove left-to-rights and right-to-lefts
        switch (c)
        {
            case '\u200E': //Left-To-Right Mark
            case '\u200F': //Right-To-Left Mark
			case '\u202A': //Left-To-Right Embedding
			case '\u202B': //Right-To-Left Embedding
			case '\u202D': //Left-To-Right Override
			case '\u202E': //Right-To-Left Override
			case '\u2066': //Left-To-Right Isolate
			case '\u2067': //Right-To-Left Isolate
			
			//https://unicode-table.com/en/blocks/general-punctuation/ add more 
			//case '\u2060': //Word Joiner
			//etc....	
                return true; 
            default:
                return false;
        }
	}
	
	public static string RemoveUnicodeControlChars(this string s) {
		
		StringBuilder sb = new StringBuilder(s.Length);
		for (int i = 0; i < s.Length; i++) 
			if ( !Char.IsControl(s[i]) && !s[i].isSpecialUnicodeCntrlChr() )
				sb.Append(s[i]);
			
    	
		return sb.ToString(); 
	}
	
	// create a lookup hashset
	private static HashSet<char> specialUnicodeCtrlChr = new HashSet<char>(new char[] {'\u200E','\u200F','\u202A','\u202B','\u202D','\u202E', '\u2066', '\u2067'} );

	public static string FilterUnicodeControlChars(this string str)
	{
		// tempbuffer
		char[] buffer = new char[str.Length];
		int index = 0;

		// check each character
		foreach (var ch in str)
			if ( !Char.IsControl(ch) && !specialUnicodeCtrlChr.Contains(ch))
				buffer[index++] = ch;

		// return the new string.
		return new String(buffer, 0, index);
	}
	
	
	public static void Main()
	{
		Stopwatch sw = new Stopwatch(); 
		Console.WriteLine("Hungarian\bGrand\t\t\r\vPrix\u202EF1");
		sw.Start();
		Console.Write("Hungarian\bGrand\t\t\r\vPrix\u202EF1".RemoveUnicodeControlChars());
		sw.Stop(); 
		Console.WriteLine(" in {0} ticks.",sw.ElapsedTicks );
		Console.WriteLine();
        sw.Reset();
		sw.Start();
		Console.WriteLine("ŐhᒰHung\u2063arian\u008D\bGrand\t\t\r\vPrix\u202EF1".RemoveUnicodeControlChars());
		sw.Stop(); 
		Console.WriteLine(" in {0} ticks.",sw.ElapsedTicks );
		Console.WriteLine();
		Console.WriteLine();
		Console.WriteLine("Using HashSet Filtering");
		sw.Reset(); // clear the ticks accumulated by the previous measurement
		sw.Start();
		Console.Write("Hungarian\bGrand\t\t\r\vPrix\u202EF1".FilterUnicodeControlChars());
		sw.Stop(); 
		Console.WriteLine(" in {0} ticks.",sw.ElapsedTicks );
		Console.WriteLine();
        sw.Reset();
		sw.Start();
		Console.WriteLine("ŐhᒰHung\u2063arian\u008D\bGrand\t\t\r\vPrix\u202EF1".FilterUnicodeControlChars());
		sw.Stop(); 
		Console.WriteLine(" in {0} ticks.",sw.ElapsedTicks );
		
	}
}

Sunday, August 4, 2019

Remove all ANSI Control Characters fast in C-Sharp




Here's how to reduce a string to ANSI and remove control characters from a string fast in C-Sharp. But be careful: accented characters such as é are not replaced with their base letter (e). To do that, you need to normalize the string first; see the UnicodeToANSI function.
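The normalization step mentioned above works because compatibility decomposition (FormKD) splits an accented letter into its base letter plus a combining mark, which a range filter can then drop. A minimal sketch, with DescribeCodeUnits as a hypothetical helper used only to display the result:

```csharp
using System;
using System.Text;

public static class NormalizationSketch
{
    // Hypothetical helper: lists the UTF-16 code units of a string,
    // for example "U+0065 U+0301".
    public static string DescribeCodeUnits(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.AppendFormat("U+{0:X4}", (int)c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string decomposed = "\u00E9".Normalize(NormalizationForm.FormKD);
        Console.WriteLine(DescribeCodeUnits("\u00E9"));   // U+00E9
        Console.WriteLine(DescribeCodeUnits(decomposed)); // U+0065 U+0301
    }
}
```

After decomposition, dropping every code unit above 255 removes the combining acute accent (U+0301) and leaves the plain letter e.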

ASCII (American Standard Code for Information Interchange) is a 7-bit character set that contains characters from 0 to 127.

The generic term ANSI (American National Standards Institute) is used for 8-bit character sets. These character sets contain the unchanged ASCII character set. In addition, they contain further characters from 128 to 255.

BTW: You can do this on the fly in windows with my Clipboard Plaintext PowerTool


Here's a list of control characters. https://unicode-table.com/en/blocks/general-punctuation/


using System; using System.Text; using System.Linq; using System.Diagnostics; 
					
public static class Program
{
	// Based on http://www.codeproject.com/Articles/13503/Stripping-Accents-from-Latin-Characters-A-Foray-in
	// Proper Normalization
	public static string UnicodeToANSI(this string inString)
	{
		var newStringBuilder = new StringBuilder();
		newStringBuilder.Append(inString.Normalize(NormalizationForm.FormKD)
								.Where(x => (x >= 32 && x <= 255)) // keep 32..255; drops control characters 0..31
								.ToArray());
		return newStringBuilder.ToString();
	}
	
	//ANSI characters 32 to 127 correspond to those in the 7-bit ASCII character set,
	public static string ReducetoASCII(this string s)
    {
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if ((int)c > 255) // remove chars outside the 8-bit (ANSI) range
                continue;
            if ((int)c < 32)  // remove  control characters 
                continue;
            sb.Append(c);
        }
        return sb.ToString();
    }
	
	public static void Main()
	{
		Stopwatch sw = new Stopwatch(); 
		string french = "A Paris, le cortège parisien s’était élancé à 14 heures.\r\n\tFace à l’affluence, un «itinéraire bis» a été mis en place. D’importants rassemblements ont lieu à Bordeaux, Marseille, Rennes ou Lyon. Suivez la journée avec nos journalistes dans toute la France.";
		string ftemp = string.Empty; 
		string german = "ޘ Trump\t\r\nverwechselt Klägerin Carroll auf Foto mit Ex-Frau – das könnte Folgen haben"; 
		string gtemp = string.Empty; 
		Console.WriteLine(french); 
		
		sw.Start();
		ftemp = french.ReducetoASCII(); 
		sw.Stop(); 
		
		Console.WriteLine("Ansi reduced\r\n" + ftemp + " in " + sw.ElapsedTicks); 
		
		sw.Reset(); 
		sw.Start();
		ftemp = french.UnicodeToANSI(); 
		sw.Stop(); 
		
		Console.WriteLine("Proper Normalization\r\n" + ftemp + " in " + sw.ElapsedTicks); 
				
		Console.WriteLine();
		Console.WriteLine();
		Console.WriteLine(german); 
		
		sw.Reset();
		sw.Start();
		gtemp = german.ReducetoASCII(); 
		sw.Stop(); 
		
		Console.WriteLine("Ansi reduced\r\n" + gtemp + " in " + sw.ElapsedTicks); 
		
		sw.Reset(); 
		sw.Start();
		gtemp = german.UnicodeToANSI(); 
		sw.Stop(); 
		
		Console.WriteLine("Proper Normalization\r\n" + gtemp + " in " + sw.ElapsedTicks); 
		
	}
}

Thursday, August 1, 2019

LibreOffice falls victim to common PDF attack - malicious macros

LibreOffice Macro Malware

You would think the open-source developers at LibreOffice would have learned the lessons from its paid brethren, but
LibreOffice has the same vulnerability as the Word macros and PDF macros that have plagued the universe for the last 10 years.














Microsoft has cleaned up its act for the most part, but Adobe Acrobat Reader is still vulnerable to PDF with malware macros. https://www.microsoft.com/security/blog/2017/01/26/phishers-unleash-simple-but-effective-social-engineering-techniques-using-pdf-attachments/

List of Current PDF Exploits : http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=pdf

LibreOffice was a fork of OpenOffice.org and is built on the original OpenOffice.org code base, and that is also subject to malicious attacks.

LibreOffice Malware Proof of Concept (POC)

LibreOffice is shipped by default with LibreLogo, a macro for programmatically moving a turtle vector graphic. To move the turtle, LibreLogo executes custom script code that is internally translated to Python code and executed. The big problem here is that the code is not translated well, and simply supplying Python code as the script code often results in the same code after translation.





















This has an official code execution vulnerability number CVE-2019-9848.



Here's a PDF Macro Malware POC with code