PowerShell performance problems
Like many others before me, I found out the hard way that PowerShell has its performance limitations, and most of them are real traps waiting for newcomers like me. I had quickly developed a script for checking a couple of things in our logs. The test logs were up to tens of MB in size and the script ran for minutes - something you can live with. But then I needed to process logs of up to hundreds of MB, and the run time jumped to a couple of hours. I consulted uncle Google, and in my case the slowness was caused by three factors:
- how the file was read
- in some cases, use of the += operator to extend an array
- use of functions
Reading the file
The Get-Content cmdlet - it is usually recommended not to use it for big files
measure-command { (Get-Content -path .\Test.log) | % { write-output $_ } } | Select-Object -Property TotalSeconds
TotalSeconds
------------
3,9129402
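A side note before moving on: if you want to stay with Get-Content, its -ReadCount parameter sends lines down the pipeline in batches instead of one object per line, which already helps a lot. A minimal sketch (the batch size of 1000 is an arbitrary choice, not something I benchmarked):

Get-Content -Path .\Test.log -ReadCount 1000 | ForEach-Object {
    # with -ReadCount, $_ is an array of up to 1000 lines, not a single string
    foreach ($line in $_) {
        # process $line here
    }
}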
Get-Content cmdlet with the -Raw parameter, split on line endings - the whole file needs to fit in memory

measure-command { (Get-Content -path .\Test.log -raw) -split "`n" | % { write-output $_ } } | Select-Object -Property TotalSeconds
TotalSeconds
------------
3,1242313
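One caveat worth knowing about this variant: on Windows files with CRLF line endings, splitting on "`n" alone leaves a trailing "`r" on every line. Since -split treats its pattern as a regular expression, a sketch of a safer split (untimed) looks like this:

# "`r?`n" is the regex \r?\n, so it handles both CRLF and bare LF endings
(Get-Content -path .\Test.log -raw) -split "`r?`n" | % { write-output $_ }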
Use a StreamReader, based on PowerShell scripting performance considerations

measure-command {
>> try
>> {
>> # StreamReader resolves relative paths against the process working directory, hence the absolute path
>> $stream = [System.IO.StreamReader]::new('C:\Test.log')
>> # compare against $null explicitly - a bare while ($line = ...) would stop at the first empty line
>> while ($null -ne ($line = $stream.ReadLine())) { write-output $line }
>> }
>> finally
>> {
>> $stream.Dispose()
>> }
>> } | Select-Object -Property TotalSeconds
TotalSeconds
------------
2,8871127
And the winner - a variant of the switch statement I didn't know existed

measure-command { switch -file .\Test.log { default { write-output $_ } } } | Select-Object -Property TotalSeconds
TotalSeconds
------------
2,4991266
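In a real log-checking script, the -Regex flavour of switch is what makes this approach practical - it streams the file line by line and pattern-matches as it goes. A minimal sketch, with 'ERROR' standing in for whatever pattern you actually need:

$errors = 0
switch -Regex -File .\Test.log {
    'ERROR' { $errors++ }
}
"Found $errors error lines"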
Extending an array

Arrays in .NET have a fixed size, so every use of the += operator allocates a new array and copies all the existing elements into it. It is clear, then, that growing a large collection this way is a performance killer.
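You can see the reallocation directly - after +=, the variable points to a brand-new array object:

$a = 1, 2
$b = $a                               # second reference to the same array
$a += 3                               # allocates a new array and copies the elements
[object]::ReferenceEquals($a, $b)     # False - $a now points to a different object
$b.Count                              # still 2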
Using an array
measure-command {
>> $arr = @()
>> $i = 0
>> do { $i++; $arr += $i} while ( $i -lt 10000)
>> } | Select-Object -Property TotalSeconds
TotalSeconds
------------
2,1282633
Using an ArrayList

measure-command {
>> $arr = [System.Collections.ArrayList]::new()
>> $i = 0
>> do { $i++; $arr.Add($i)} while ( $i -lt 10000)
>> } | Select-Object -Property TotalSeconds
>>
TotalSeconds
------------
0,0449895
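For completeness: the usual recommendation nowadays is the generic List[T] rather than ArrayList. It also grows in place, avoids boxing of value types, and its Add() returns nothing, so there is no stray pipeline output to suppress with [void] or $null = as there is with ArrayList.Add(). A minimal sketch, not benchmarked here:

$list = [System.Collections.Generic.List[int]]::new()
$i = 0
do { $i++; $list.Add($i) } while ( $i -lt 10000)
$list.Count   # 10000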
Using functions

It is probably this function call overhead bug. When a script is supposed to run against a big amount of data, one can manually inline the code of the most time-consuming functions, but that is far from ideal.
With a function call
measure-command {
>> function Test {
>> param( [int]$i )
>> $i * 2
>> }
>>
>> $i = 0
>> $t = 0
>> do { $i++; $t += (Test $i)} while ( $i -lt 10000)
>> } | Select-Object -Property TotalSeconds
>>
TotalSeconds
------------
0,2889852
With the code inlined

measure-command {
>> $i = 0
>> $t = 0
>> do { $i++; $t += ($i * 2)} while ( $i -lt 10000)
>> } | Select-Object -Property TotalSeconds
>>
TotalSeconds
------------
0,027523

There are a few more things to pay attention to in PowerShell scripting performance considerations.