本文目录
首先强调,Join-Object是一个自定义命令,不是Powershell内置命令
问题
怎样将两个对象列表合并成一个单独的列表?或者同样的问题,怎样将两个CSV文件合并成一个单独的文件。两个列表或者两个CSV文件能合并的前提是它们有一些共同的东西,例如共同的信仰。
为什么需要连接?
如何你已经对关系数据库的基本概念有所了解,可以跳过下面的这部分介绍,直接进入文章的结尾查看脚本文件。
我们假设你有一个公司,你想跟踪员工的出勤率,可能需要一个简单的员工个人信息表(Employee),例如:
Employee |
ID | Name | 1 | John | 2 | Mark | 3 | Hanna |
ID作为员工编号,具有唯一性,不能重复。为什么需要员工编号,因为显示世界可能会出现两位员工的同名同姓。另外一张表格应当是考勤表(Entrance),包含了员工的编号和上班打卡时间如下。
Entrance |
|
Employee | When |
1 | 6/12/2012 08:05:01 AM |
1 | 6/13/2012 07:59:12 AM |
1 | 6/14/2012 07:49:10 AM |
2 | 6/12/2012 10:33:00 AM |
2 | 6/13/2012 10:15:00 AM |
44 | 2/29/2012 01:00:00 AM |
N
上面表格中的列”When”,包含了日期和时间,和.NET中与数据库中的DateTime对象一样。从这两张表格中,我们可以得到如下信息。
John(员工编号1)似乎上班比较勤快,一般早上8点左右就到了。
Mark(员工编号2)一般喜欢迟到,稍微晚一点才来上班。
员工编号为44的这条记录是什么情况?他是人是鬼,怎么半夜来公司上班。还有就是他怎么没有出现在员工信息表中,难道离职了?作为数据库的设计人员必须提前处理好,不允许有这样的数据存在。
接下来稍微讨论一下,为什么会有两张表。为什么不把员工信息和考勤表存储在一张表格中?假如这样存储了,可能在考勤表中会出现若干相同的员工信息,也就是被称之为的数据冗余。
不同类型的链接
- 内连接
数据库查询:SELECT Employee.Name,Entrance.[When] FROM Employee INNER JOIN Entrance ON Employee.id = Entrance.EmployeeId
结果
Name When John 2012-06-12 08:05:01.000 John 2012-06-13 07:59:12.000 John 2012-06-14 07:49:10.000 Mark 2012-06-12 10:33:00.000 Mark 2012-06-13 10:15:00.000
- 左外连接
数据库查询:select Employee.Name,Entrance.[When] from Employee left outer join entrance on Entrance.id = employee.id
结果
Name When John 2012-06-12 08:05:01.000 John 2012-06-13 07:59:12.000 John 2012-06-14 07:49:10.000 Mark 2012-06-12 10:33:00.000 Mark 2012-06-13 10:15:00.000 Hanna NULL
- 右外连接
数据库查询:select Employee.Name,Entrance.[When] from Employee right outer join entrance on Entrance.id = employee.id
结果
Name When John 2012-06-12 08:05:01.000 John 2012-06-13 07:59:12.000 John 2012-06-14 07:49:10.000 Mark 2012-06-12 10:33:00.000 Mark 2012-06-13 10:15:00.000 NULL 2012-02-29 01:00:00.000
- 全连接
数据库查询:select Employee.Name,Entrance.[When] from Employee full join entrance on Entrance.id = employee.id
结果
Name When John 2012-06-12 08:05:01.000 John 2012-06-13 07:59:12.000 John 2012-06-14 07:49:10.000 Mark 2012-06-12 10:33:00.000 Mark 2012-06-13 10:15:00.000 Hanna NULL NULL 2012-02-29 01:00:00.000
在Windows PowerShell 中连接对象
在Windows PowerShell中并没有内置的命令处理对象连接,所以原作者才会自己写了一个名为Join-Object的命令来演示怎样对内存中的对象列表实现连接,当然这些处理的对象也可以来自其它地方,例如CSV文件中。
假如c:\temp\employee.csv文件内容为:
假如c:\temp\entrance.csv文件内容为:
分别将两个CSV文件导入到内存中保存为$employee 和$entrance ,并查看变量的内容:
PS D:\> $employee = Import-Csv c:\temp\employee.csv PS D:\> $entrance = Import-Csv c:\temp\entrance.csv PS D:\> $employee | ft -AutoSize Id Name -- ---- 1 John 2 Mark 3 Hanna PS D:\> PS D:\> $entrance |ft -AutoSize EmployeeId When ---------- ---- 1 6/12/2012 08:05:01 AM 1 6/13/2012 07:59:12 AM 1 6/14/2012 07:49:10 AM 2 6/12/2012 10:33:00 AM 2 6/13/2012 10:15:00 AM 44 2/29/2012 01:00:00 AM
之所以选择CSV格式的数据,主要是因为可以很直观的展示对象列表中得所有数据。 下面的4个例子演示如何了4中不同方式的连接。
PS D:\> Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args[1].EmployeeId} -LeftProperties "Name" -RightProperties "When" -Type OnlyIfInBoth Name When ---- ---- John 6/12/2012 08:05:01 AM John 6/13/2012 07:59:12 AM John 6/14/2012 07:49:10 AM Mark 6/12/2012 10:33:00 AM Mark 6/13/2012 10:15:00 AM
Join-Object的前两个参数分别指定要连接的两个对象列表。
“Where”参数是连接的条件,$args[0]代表左边列表中的对象,$args[1]代表右边列表中的对象。
参数RightProperties 和LeftProperties 分别代表了输出的新列表中属性的名称。
参数Type是一组枚举名称,包含4个名称AllInLeft, AllInRight, OnlyIfInBoth 和 AllInBoth 我认为这四个名称非常可以容忍非常容易的与SQL中的连接关联起来,但是还是要在下面的表格中给出对应关系。
Join-Object | SQL |
AllInLeft | Left Outer |
AllIInRight | Right Outer |
OnlyIfInBoth | Inner |
AllInBoth | Full Outer |
这是一个左连接的例子
PS C:\temp> Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args[1].EmployeeId} -LeftProperties "Name" -RightProperties "When" -Type AllInLeft Name When ---- ---- John 6/12/2012 08:05:01 AM John 6/13/2012 07:59:12 AM John 6/14/2012 07:49:10 AM Mark 6/12/2012 10:33:00 AM Mark 6/13/2012 10:15:00 AM Hanna
这是一个右连接的例子
PS C:\temp> Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args[1].EmployeeId} -LeftProperties "Name" -RightProperties "When" -Type AllInRight Name When ---- ---- John 6/12/2012 08:05:01 AM John 6/13/2012 07:59:12 AM John 6/14/2012 07:49:10 AM Mark 6/12/2012 10:33:00 AM Mark 6/13/2012 10:15:00 AM 2/29/2012 01:00:00 AM
这是一个全连接的例子
PS C:\temp> Join-Object –Left $employee –Right $entrance –Where {$args[0].Id -eq $args[1].EmployeeId} –LeftProperties "Name" –RightProperties "When" -Type AllInBoth Name When ---- ---- John 6/12/2012 08:05:01 AM John 6/13/2012 07:59:12 AM John 6/14/2012 07:49:10 AM Mark 6/12/2012 10:33:00 AM Mark 6/13/2012 10:15:00 AM Hanna 2/29/2012 01:00:00 AM
连接是一个数据处理时一个很重要的工具,因为绝大多数情况下,多个表格连接在一起,才能反映用户的需求。因为Join-Object非常有用奥! 千呼万唤始出来,下面附上Join-Object命令的内部脚本实现。
Join-Object 脚本
function AddItemProperties($item, $properties, $output) { if($item -ne $null) { foreach($property in $properties) { $propertyHash =$property -as [hashtable] if($propertyHash -ne $null) { $hashName=$propertyHash["name"] -as [string] if($hashName -eq $null) { throw "there should be a string Name" } $expression=$propertyHash["expression"] -as [scriptblock] if($expression -eq $null) { throw "there should be a ScriptBlock Expression" } $_=$item $expressionValue=& $expression $output | add-member -MemberType "NoteProperty" -Name $hashName -Value $expressionValue } else { # .psobject.Properties allows you to list the properties of any object, also known as "reflection" foreach($itemProperty in $item.psobject.Properties) { if ($itemProperty.Name -like $property) { $output | add-member -MemberType "NoteProperty" -Name $itemProperty.Name -Value $itemProperty.Value } } } } } } function WriteJoinObjectOutput($leftItem, $rightItem, $leftProperties, $rightProperties, $Type) { $output = new-object psobject if($Type -eq "AllInRight") { # This mix of rightItem with LeftProperties and vice versa is due to # the switch of Left and Right arguments for AllInRight AddItemProperties $rightItem $leftProperties $output AddItemProperties $leftItem $rightProperties $output } else { AddItemProperties $leftItem $leftProperties $output AddItemProperties $rightItem $rightProperties $output } $output } <# .Synopsis Joins two lists of objects .DESCRIPTION Joins two lists of objects .EXAMPLE Join-Object $a $b "Id" ("Name","Salary") #> function Join-Object { [CmdletBinding()] [OutputType([int])] Param ( # List to join with $Right [Parameter(Mandatory=$true, Position=0)] [object[]] $Left, # List to join with $Left [Parameter(Mandatory=$true, Position=1)] [object[]] $Right, # Condition in which an item in the left matches an item in the right # typically something like: {$args[0].Id -eq $args[1].Id} [Parameter(Mandatory=$true, Position=2)] [scriptblock] $Where, # Properties from $Left we want in the output. # Each property can: # - Be a plain property name like "Name" # - Contain wildcards like "*" # - Be a hashtable like @{Name="Product Name";Expression={$_.Name}}. Name is the output property name # and Expression is the property value. The same syntax is available in select-object and it is # important for join-object because joined lists could have a property with the same name [Parameter(Mandatory=$true, Position=3)] [object[]] $LeftProperties, # Properties from $Right we want in the output. # Like LeftProperties, each can be a plain name, wildcard or hashtable. See the LeftProperties comments. [Parameter(Mandatory=$true, Position=4)] [object[]] $RightProperties, # Type of join. # AllInLeft will have all elements from Left at least once in the output, and might appear more than once # if the where clause is true for more than one element in right, Left elements with matches in Right are # preceded by elements with no matches. This is equivalent to an outer left join (or simply left join) # SQL statement. # AllInRight is similar to AllInLeft. # OnlyIfInBoth will cause all elements from Left to be placed in the output, only if there is at least one # match in Right. This is equivalent to a SQL inner join (or simply join) statement. # AllInBoth will have all entries in right and left in the output. Specifically, it will have all entries # in right with at least one match in left, followed by all entries in Right with no matches in left, # followed by all entries in Left with no matches in Right.This is equivallent to a SQL full join. [Parameter(Mandatory=$false, Position=5)] [ValidateSet("AllInLeft","OnlyIfInBoth","AllInBoth", "AllInRight")] [string] $Type="OnlyIfInBoth" ) Begin { # a list of the matches in right for each object in left $leftMatchesInRight = new-object System.Collections.ArrayList # the count for all matches $rightMatchesCount = New-Object "object[]" $Right.Count for($i=0;$i -lt $Right.Count;$i++) { $rightMatchesCount[$i]=0 } } Process { if($Type -eq "AllInRight") { # for AllInRight we just switch Left and Right $aux = $Left $Left = $Right $Right = $aux } # go over items in $Left and produce the list of matches foreach($leftItem in $Left) { $leftItemMatchesInRight = new-object System.Collections.ArrayList $null = $leftMatchesInRight.Add($leftItemMatchesInRight) for($i=0; $i -lt $right.Count;$i++) { $rightItem=$right[$i] if($Type -eq "AllInRight") { # For AllInRight, we want $args[0] to refer to the left and $args[1] to refer to right, # but since we switched left and right, we have to switch the where arguments $whereLeft = $rightItem $whereRight = $leftItem } else { $whereLeft = $leftItem $whereRight = $rightItem } if(Invoke-Command -ScriptBlock $where -ArgumentList $whereLeft,$whereRight) { $null = $leftItemMatchesInRight.Add($rightItem) $rightMatchesCount[$i]++ } } } # go over the list of matches and produce output for($i=0; $i -lt $left.Count;$i++) { $leftItemMatchesInRight=$leftMatchesInRight[$i] $leftItem=$left[$i] if($leftItemMatchesInRight.Count -eq 0) { if($Type -ne "OnlyIfInBoth") { WriteJoinObjectOutput $leftItem $null $LeftProperties $RightProperties $Type } continue } foreach($leftItemMatchInRight in $leftItemMatchesInRight) { WriteJoinObjectOutput $leftItem $leftItemMatchInRight $LeftProperties $RightProperties $Type } } } End { #produce final output for members of right with no matches for the AllInBoth option if($Type -eq "AllInBoth") { for($i=0; $i -lt $right.Count;$i++) { $rightMatchCount=$rightMatchesCount[$i] if($rightMatchCount -eq 0) { $rightItem=$Right[$i] WriteJoinObjectOutput $null $rightItem $LeftProperties $RightProperties $Type } } } } }
原文地址:http://blogs.msdn.com/b/powershell/archive/2012/07/13/join-object.aspx
原作者:PowerShell Team
请尊重原作者和编辑的辛勤劳动,欢迎转载,并注明出处!
找了半天都木有此命令,仔细一看原来是你自己写的。建议在标题或者文章打头说明。
PS C:Userstravxie> $employeeemployee id name ———– —- 1 John 2 mark 3 hanna PS C:Userstravxie> $entranceemployeeid when ———- —- 1 6/12/2012 08:05:01 AM 1 6/13/2012 07:59:12 AM 1 6/14/2012 07:49:10 AM 2 6/12/2012 10:33:00 AM 2 6/13/2012 10:15:00 AM 44 2/29/2012 01:00:00 AM PS C:Userstravxie> Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args
.EmployeeId} -LeftProperties “Name” -RightProperties “When” -Type OnlyIfInBothPS C:Userstravxie> $a=Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args
.EmployeeId} -LeftProperties “Name” -RightProperties “When” -Type OnlyIfInBothPS C:Userstravxie> $a不懂为何没有输出 没有报错的
网站有点小问题,麻烦你把code重新发一下,放在pre或者code标签中,谢谢!