Tuesday, June 18, 2019

C Sharp - Push & Pop String Array

Here's an example of push and pop on a string[] array in C Sharp, with timings. Surprisingly hard to find, because most solutions default to Queue queue = new Queue();



using System;
using System.Diagnostics;
using System.Linq;

public static class Program
{
 public static void PushStringCopyArray(ref string[] array, string pushvalue) {
  
        string[] temp = new string[array.Length];
        temp[0] = pushvalue;

        Array.Copy(array, 0, temp, 1, array.Length - 1);

        array = temp;
 }

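 /// <summary>
 /// Push value onto the top (index 0) of string[] ref array.
 /// The array length stays fixed, so the last element is dropped.
 /// </summary>
 public static void PushStringArray(ref string[] array, string pushvalue) {

        string[] temp = new string[array.Length];

        temp[0] = pushvalue; //push new value on top of array

        //shift the existing values down by one, dropping the last one
        for (int i = 0; i < array.Length - 1; i++)
            temp[i + 1] = array[i];

        array = temp;
 }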
 /// <summary>
 /// Pop value from top of string[] ref array
 /// </summary>
 public static string PopStringArray(ref string[] array) {

        //string[] temp = new string[array.Length]; //default value would be null
        string[] temp = Enumerable.Repeat("#", array.Length).ToArray(); //fill with "#" to mark emptied slots

        string popvalue = array[0]; //take the value from the top of the array

        //shift the remaining values up by one
        for (int i = array.Length - 1; i >= 1; i--)
            temp[i - 1] = array[i];

        array = temp;

        return popvalue;
 }

 public static void PrintArray(ref string[] array) {
        for (int i = 0; i < array.Length; i++)
        Console.Write("a[" + i + "]=" + array[i] + ",");
        Console.WriteLine();
        Console.WriteLine();
 }

public static void Main() {
    Stopwatch sw = new Stopwatch();
    string[] test = new string[] {"z","a","b","c"};

    Console.WriteLine("Pop/Push Array Test");
    Console.WriteLine("Arr Len=" + test.Length);
    PrintArray(ref test);
    PushStringArray(ref test, "1st");
    PrintArray(ref test);
    PushStringArray(ref test, "2nd");
    PrintArray(ref test);
    PushStringArray(ref test, "3rd");
    PrintArray(ref test);

    sw.Start();
    PushStringArray(ref test, "4th");
    sw.Stop();
    Console.WriteLine(sw.ElapsedTicks + " ticks. WOW! For loop, but still O(n).");
    PrintArray(ref test);

    sw.Restart(); //Restart() zeroes the counter; Start() would add to the ticks already measured above
    PushStringCopyArray(ref test, "5th");
    sw.Stop();
    Console.WriteLine(sw.ElapsedTicks + " ticks. Array.Copy:  I thought this would be better.");
    PrintArray(ref test);

    Console.WriteLine(PopStringArray(ref test));
    PrintArray(ref test);
    Console.WriteLine(PopStringArray(ref test));
    PrintArray(ref test);
    Console.WriteLine(PopStringArray(ref test));
    PrintArray(ref test);
    Console.WriteLine(PopStringArray(ref test));
    PrintArray(ref test);
    Console.WriteLine(PopStringArray(ref test));
    PrintArray(ref test);
}
}
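
A single timed call mostly measures JIT compilation and timer noise, so the tick counts above say little about for loop vs Array.Copy. Below is a rough sketch of a fairer comparison, with a warm-up call and many repetitions. The class name PushBenchmark, the array size and the iteration count are assumptions picked for illustration, and the numbers will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;

public static class PushBenchmark
{
    // Fixed-length push using a for loop (same logic as PushStringArray).
    public static void PushLoop(ref string[] array, string value)
    {
        string[] temp = new string[array.Length];
        temp[0] = value;
        for (int i = 0; i < array.Length - 1; i++)
            temp[i + 1] = array[i];
        array = temp;
    }

    // Fixed-length push using Array.Copy (same logic as PushStringCopyArray).
    public static void PushCopy(ref string[] array, string value)
    {
        string[] temp = new string[array.Length];
        temp[0] = value;
        Array.Copy(array, 0, temp, 1, array.Length - 1);
        array = temp;
    }

    public static void Main()
    {
        const int size = 1000;         // assumed array length
        const int iterations = 100000; // assumed number of repetitions

        string[] a = new string[size];
        string[] b = new string[size];

        // Warm-up so JIT compilation is not counted in the timings.
        PushLoop(ref a, "warm");
        PushCopy(ref b, "warm");

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            PushLoop(ref a, "x");
        sw.Stop();
        Console.WriteLine("for loop:   " + sw.ElapsedMilliseconds + " ms");

        sw.Restart(); // zero the counter before the second measurement
        for (int i = 0; i < iterations; i++)
            PushCopy(ref b, "x");
        sw.Stop();
        Console.WriteLine("Array.Copy: " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Both versions are still O(n) per push, so any difference is a constant factor, and it tends to show up only once the array is larger than a handful of elements.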

Monday, June 17, 2019

How to set up Tidy to properly read HTML pages and encodings

After many hours of painstaking analysis, it turns out the TidyManaged package (https://github.com/markbeaton/TidyManaged) cannot read an HTML string correctly.

It could be an implementation error in the TidyManaged library, or a subtle consequence of how the .NET Framework 4.0 handles strings: they are stored in memory as UTF-16, and the conversion on the way into Tidy mangles HTML entities in particular. This was supposed to be fixed in this version of the framework and above, but perhaps it persists.

Say cleanHtml contains HTML from a web page that was encoded in UTF-8.

It turns out this gets mangled when passed into this method:

TidyManaged.Document.FromString(cleanHtml);

You have to use a MemoryStream to feed it the UTF-8 bytes properly:

TidyManaged.Document.FromStream(HTMLinput);
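
To make the UTF-16 vs UTF-8 difference concrete, here is a small sketch that uses only the .NET base class library and does not call TidyManaged at all. The class name EncodingDemo and the sample markup are made up for illustration. It shows that the same string has a different byte layout in each encoding, and that a MemoryStream over the UTF-8 bytes round-trips cleanly.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDemo
{
    public static void Main()
    {
        // Hypothetical sample input: 11 characters, one of them non-ASCII (é).
        string html = "<p>caf\u00e9</p>";

        // The in-memory representation of a .NET string is UTF-16.
        byte[] utf16 = Encoding.Unicode.GetBytes(html);
        // The bytes a UTF-8 web page would actually contain.
        byte[] utf8 = Encoding.UTF8.GetBytes(html);

        Console.WriteLine("UTF-16 bytes: " + utf16.Length); // prints 22
        Console.WriteLine("UTF-8 bytes:  " + utf8.Length);  // prints 12

        // Wrapping the UTF-8 bytes in a stream and decoding them as UTF-8
        // gives back the original text unchanged.
        using (MemoryStream stream = new MemoryStream(utf8))
        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
        {
            Console.WriteLine(reader.ReadToEnd() == html); // prints True
        }
    }
}
```

Any consumer that reads raw bytes and assumes UTF-8 sees the 12-byte form as intended, which is the form the stream approach hands over.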

See the code below to properly handle HTML documents for Tidy in .NET.

            
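            //requires: using System; using System.IO; using System.Text; using TidyManaged;
            //After many hours of painstaking analysis, this is how to get the HTML passed in correctly:
            //encode the string as UTF-8 bytes and wrap them in a stream
            byte[] encodedBytesUTF8 = Encoding.UTF8.GetBytes(htmlText);
            MemoryStream HTMLinput = new MemoryStream(encodedBytesUTF8);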
            
            try
            {
                //THIS DOES NOT WORK, it re-encodes the string improperly (found after many hours of painstaking analysis)
                //tidydoc = TidyManaged.Document.FromString(cleanHtml); 
                
                tidydoc = TidyManaged.Document.FromStream(HTMLinput);
            }
            catch (Exception tex)
            {
                return "tidy HTML parsing error: " + tex.Message;
            }
           
            using ( tidydoc )
            {
                //http://api.html-tidy.org/tidy/quickref_5.0.0.html
                
                tidydoc.ShowWarnings = false;
                tidydoc.Quiet = true;
                tidydoc.ForceOutput = true;
                tidydoc.OutputBodyOnly = AutoBool.Auto;

                tidydoc.DocType = TidyManaged.DocTypeMode.Omit;
                tidydoc.DropFontTags = false;
                tidydoc.UseLogicalEmphasis = false;
                tidydoc.LowerCaseLiterals = false;
                
                tidydoc.OutputXhtml = false;
                tidydoc.OutputXml = false;
                
                tidydoc.MakeClean = false;
                
                tidydoc.DropEmptyParagraphs = false;
                tidydoc.CleanWord2000 = false;

                tidydoc.QuoteAmpersands = false; //This option specifies if Tidy should output unadorned & characters as &amp;.
                tidydoc.AsciiEntities = false;   //Can be used to modify behavior of -c (--clean yes) option. If set to "yes" when using -c, &emdash;, &rdquo;, and other named character entities are downgraded to their closest ascii equivalents.
                tidydoc.PreserveEntities = true; //This option specifies if Tidy should preserve the well-formed entities as found in the input.
                tidydoc.OutputNumericEntities = true; //This option specifies if Tidy should output entities other than the built-in HTML entities (&amp;, &lt;, &gt; and &quot;) in the numeric rather than the named entity form

                tidydoc.JoinStyles = false;
                tidydoc.JoinClasses = false;
                
                tidydoc.Markup = true; //prettify open

                tidydoc.WrapAt = 0;
                tidydoc.IndentSpaces = 4;
                tidydoc.IndentBlockElements = TidyManaged.AutoBool.Yes; // this increases file size! (but makes it better to read)


                tidydoc.InputCharacterEncoding = TidyManaged.EncodingType.Utf8;
                tidydoc.CharacterEncoding = TidyManaged.EncodingType.Utf8; //For raw, Tidy will output values above 127 without translating them into entities.
                tidydoc.OutputCharacterEncoding = TidyManaged.EncodingType.Utf8;

                
                tidydoc.JoinStyles = false;
                tidydoc.MergeDivs = AutoBool.No;
                tidydoc.MergeSpans = AutoBool.No;
                tidydoc.OutputHtml = true;
                

                tidydoc.UseXmlParser = false;
                tidydoc.AddTidyMetaElement = false;
                
                
                try
                {
                    tidydoc.CleanAndRepair(); //required
                }
                catch (Exception car)
                {
                    return "tidy HTML clean && repair error: " + car.Message;
                }
                try
                {
                    cleanHtml = tidydoc.Save();
                }
                catch (Exception save)
                {
                    return "tidy HTML save error: " + save.Message;
                }
            }

Friday, June 14, 2019

Why hackers want to get into your computer or mobile device


  • Clients are always telling me, "I don't need antivirus software." (Especially Mac users. FYI, year to date Macs have 2209 vulnerabilities vs Windows at 900.)
  • "I have nothing to hide on my computer or mobile device."
  • "So why would hackers want to hack my computer so badly?"


This is the same argument used by anti-vaxxers: I don't need it, I never get sick, I am strong enough.

Well, you may be the unwitting carrier of this virus, and the spreader of the disease to your closest family and friends. Now you have become the infection. And most likely you won't even know it, because symptoms are not showing. Active carriers who do not present signs or symptoms of disease despite infection are called asymptomatic carriers, in medical parlance.


In computer parlance, asymptomatic carriers are called zombies. 

Most of the time hackers want your personal information, but there's another reason. They want to be able to control your computer to do things. Get enough of them together, and you can carry out attacks. An army of zombie computers can be used in DDoS attacks. A DDoS attack is basically flooding a service or website with millions of requests at the same time, overloading that service or site. Each zombie computer has to come from somewhere, and guess what: if you don't have antivirus, yours could well be one of them. Mac users!

Here's a real-world consequence of your zombie machine:

From Slashdot: the takedown of Telegram via a DDoS attack, to hamper the Hong Kong protests for continued freedoms!

The distributed denial of service attack that hit Telegram Wednesday came from China, the secure messaging app's founder said. Pavel Durov's tweet suggested that the country's government may have done it to disrupt protests in Hong Kong. From a report: In a DDoS attack, an online service gets bombarded with traffic from networks of bots, to the point where it's overwhelmed and legit users get frozen out. In an explanation Wednesday, Telegram compared it to an "army of lemmings" jumping the line at a McDonald's and making innumerable "garbage requests." Durov said, "IP addresses coming mostly from China. Historically, all state actor-sized DDoS (200-400 Gb/s of junk) we experienced coincided in time with protests in Hong Kong (coordinated on Telegram). This case was not an exception." Tens of thousands took to Hong Kong's streets to oppose a government plan that'd allow extraditions to mainland China. People are worried that it would bring the semiautonomous former British colony under the Chinese government's thumb. These protesters relied on encrypted messaging services, which let them mask their identities from Chinese authorities, to communicate.