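A note up front: Join-Object is a custom command, not a built-in PowerShell command.
The problem
How do you join two lists of objects into a single list? Or, equivalently, how do you join two CSV files into a single file? Two lists or two CSV files can only be joined if they have something in common, for example an ID column that appears in both.
Why join?
If you already know the basic concepts of relational databases, you can skip the introduction below and go straight to the script at the end of the article.
Suppose you run a company and want to track employee attendance. You will probably need a simple table of employee details (Employee), for example: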
Employee
| ID | Name |
| 1 | John |
| 2 | Mark |
| 3 | Hanna |
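ID is the employee number; it is unique and cannot repeat. Why have an employee number at all? Because in the real world two employees may share exactly the same name. The second table is the attendance table (Entrance), which holds the employee number and the clock-in time: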
Entrance
| EmployeeId | When |
| 1 | 6/12/2012 08:05:01 AM |
| 1 | 6/13/2012 07:59:12 AM |
| 1 | 6/14/2012 07:49:10 AM |
| 2 | 6/12/2012 10:33:00 AM |
| 2 | 6/13/2012 10:15:00 AM |
| 44 | 2/29/2012 01:00:00 AM |
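The "When" column above holds a date and a time, like the DateTime type in .NET and in databases. From these two tables we can learn the following.
John (employee 1) seems diligent and usually arrives at around 8 in the morning.
Mark (employee 2) tends to come in somewhat later.
What about the record for employee 44? Is this a person or a ghost, showing up for work in the middle of the night? And why is there no such employee in the Employee table; has he left the company? The database designer has to deal with this up front and must not allow such data to exist.
Next, a brief word on why there are two tables. Why not store the employee details and the attendance records in a single table? If we did, the same employee details would be repeated in many attendance rows, which is what is called data redundancy.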
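In PowerShell terms, a record like the one for employee 44 can be spotted by checking each Entrance row against the known employee IDs. The following is only a sketch: the inline data mirrors the two tables above, and the variable names `$knownIds` and `$orphans` are made up for illustration.

```powershell
# Inline sample data mirroring the Employee and Entrance tables above.
$employee = @'
Id,Name
1,John
2,Mark
3,Hanna
'@ | ConvertFrom-Csv

$entrance = @'
EmployeeId,When
1,6/12/2012 08:05:01 AM
1,6/13/2012 07:59:12 AM
1,6/14/2012 07:49:10 AM
2,6/12/2012 10:33:00 AM
2,6/13/2012 10:15:00 AM
44,2/29/2012 01:00:00 AM
'@ | ConvertFrom-Csv

# Collect the IDs that exist in the Employee list.
$knownIds = $employee | ForEach-Object { $_.Id }

# Keep only the Entrance rows whose EmployeeId is not a known ID.
$orphans = $entrance | Where-Object { $knownIds -notcontains $_.EmployeeId }
$orphans | Format-Table -AutoSize
```

With this sample data, the only row that survives the filter is the one for employee 44.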
Types of joins
- Inner join
SQL query: SELECT Employee.Name, Entrance.[When] FROM Employee INNER JOIN Entrance ON Employee.id = Entrance.EmployeeId
Result
| Name | When |
| John | 2012-06-12 08:05:01.000 |
| John | 2012-06-13 07:59:12.000 |
| John | 2012-06-14 07:49:10.000 |
| Mark | 2012-06-12 10:33:00.000 |
| Mark | 2012-06-13 10:15:00.000 |
- Left outer join
SQL query: SELECT Employee.Name, Entrance.[When] FROM Employee LEFT OUTER JOIN Entrance ON Employee.id = Entrance.EmployeeId
Result
| Name | When |
| John | 2012-06-12 08:05:01.000 |
| John | 2012-06-13 07:59:12.000 |
| John | 2012-06-14 07:49:10.000 |
| Mark | 2012-06-12 10:33:00.000 |
| Mark | 2012-06-13 10:15:00.000 |
| Hanna | NULL |
- Right outer join
SQL query: SELECT Employee.Name, Entrance.[When] FROM Employee RIGHT OUTER JOIN Entrance ON Employee.id = Entrance.EmployeeId
Result
| Name | When |
| John | 2012-06-12 08:05:01.000 |
| John | 2012-06-13 07:59:12.000 |
| John | 2012-06-14 07:49:10.000 |
| Mark | 2012-06-12 10:33:00.000 |
| Mark | 2012-06-13 10:15:00.000 |
| NULL | 2012-02-29 01:00:00.000 |
- Full join
SQL query: SELECT Employee.Name, Entrance.[When] FROM Employee FULL JOIN Entrance ON Employee.id = Entrance.EmployeeId
Result
| Name | When |
| John | 2012-06-12 08:05:01.000 |
| John | 2012-06-13 07:59:12.000 |
| John | 2012-06-14 07:49:10.000 |
| Mark | 2012-06-12 10:33:00.000 |
| Mark | 2012-06-13 10:15:00.000 |
| Hanna | NULL |
| NULL | 2012-02-29 01:00:00.000 |
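Joining objects in Windows PowerShell
Windows PowerShell has no built-in command for joining objects, which is why the original author wrote a command named Join-Object to demonstrate how to join lists of objects held in memory. The objects can of course come from somewhere else, for example from CSV files.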
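To see what such a command has to do, here is a rough sketch of an inner join written by hand with two nested loops. It is not the author's implementation, only an illustration; the inline data mirrors the tables above and the loop variables `$e` and `$t` are arbitrary names.

```powershell
# Inline sample data mirroring the Employee and Entrance tables above.
$employee = @'
Id,Name
1,John
2,Mark
3,Hanna
'@ | ConvertFrom-Csv

$entrance = @'
EmployeeId,When
1,6/12/2012 08:05:01 AM
1,6/13/2012 07:59:12 AM
1,6/14/2012 07:49:10 AM
2,6/12/2012 10:33:00 AM
2,6/13/2012 10:15:00 AM
44,2/29/2012 01:00:00 AM
'@ | ConvertFrom-Csv

# Manual inner join: emit one object for every (employee, entrance) pair whose IDs match.
$joined = foreach ($e in $employee) {
    foreach ($t in $entrance) {
        if ($e.Id -eq $t.EmployeeId) {
            New-Object psobject -Property @{ Name = $e.Name; When = $t.When }
        }
    }
}

# Name the columns explicitly, because a plain hashtable does not guarantee property order.
$joined | Format-Table Name, When -AutoSize
```

Every other join type needs extra bookkeeping for the rows without a match, which is the part Join-Object takes care of.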
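Suppose c:\temp\employee.csv contains the following (as implied by the Import-Csv output below):
Id,Name
1,John
2,Mark
3,Hanna
And suppose c:\temp\entrance.csv contains:
EmployeeId,When
1,6/12/2012 08:05:01 AM
1,6/13/2012 07:59:12 AM
1,6/14/2012 07:49:10 AM
2,6/12/2012 10:33:00 AM
2,6/13/2012 10:15:00 AM
44,2/29/2012 01:00:00 AM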
Import the two CSV files into memory as $employee and $entrance, then look at the contents of the variables:
PS D:\> $employee = Import-Csv c:\temp\employee.csv
PS D:\> $entrance = Import-Csv c:\temp\entrance.csv
PS D:\> $employee | ft -AutoSize
Id Name
-- ----
1  John
2  Mark
3  Hanna
PS D:\> $entrance | ft -AutoSize
EmployeeId When
---------- ----
1          6/12/2012 08:05:01 AM
1          6/13/2012 07:59:12 AM
1          6/14/2012 07:49:10 AM
2          6/12/2012 10:33:00 AM
2          6/13/2012 10:15:00 AM
44         2/29/2012 01:00:00 AM
CSV data was chosen mainly because it makes it easy to see all the data in the object lists. The following four examples demonstrate the four types of join. The first is an inner join:
PS D:\> Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args[1].EmployeeId} -LeftProperties "Name" -RightProperties "When" -Type OnlyIfInBoth
Name When
---- ----
John 6/12/2012 08:05:01 AM
John 6/13/2012 07:59:12 AM
John 6/14/2012 07:49:10 AM
Mark 6/12/2012 10:33:00 AM
Mark 6/13/2012 10:15:00 AM
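The first two parameters of Join-Object specify the two lists of objects to join.
The Where parameter is the join condition: $args[0] stands for an object from the left list and $args[1] for an object from the right list.
The LeftProperties and RightProperties parameters name the properties to take from the left and right lists, respectively, for the objects in the output list.
The Type parameter accepts one of four values: AllInLeft, AllInRight, OnlyIfInBoth and AllInBoth (OnlyIfInBoth is the default). I think these four names are very easy to relate to the SQL joins, but the table below spells out the correspondence anyway.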
| Join-Object | SQL |
| AllInLeft | Left Outer |
| AllInRight | Right Outer |
| OnlyIfInBoth | Inner |
| AllInBoth | Full Outer |
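According to the comments in the script at the end of the article, each entry in LeftProperties and RightProperties may also be a hashtable of the form @{Name=...;Expression=...}, the same calculated-property syntax that Select-Object accepts. This is useful when both lists have a property with the same name. The sketch below uses the built-in Select-Object to show the syntax in isolation; the output name EmployeeNumber is invented for the example.

```powershell
# Inline sample data mirroring the Employee table above.
$employee = @'
Id,Name
1,John
2,Mark
3,Hanna
'@ | ConvertFrom-Csv

# "Name" is a plain property name; the hashtable outputs Id under a new name.
# EmployeeNumber is a hypothetical output name chosen for this illustration.
$renamed = $employee | Select-Object Name, @{Name = 'EmployeeNumber'; Expression = { $_.Id }}
$renamed | Format-Table -AutoSize
```

Inside the Expression script block, $_ refers to the current input object, which matches how the script's AddItemProperties function evaluates such hashtables.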
This is a left join example:
PS C:\temp> Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args[1].EmployeeId} -LeftProperties "Name" -RightProperties "When" -Type AllInLeft
Name When
---- ----
John 6/12/2012 08:05:01 AM
John 6/13/2012 07:59:12 AM
John 6/14/2012 07:49:10 AM
Mark 6/12/2012 10:33:00 AM
Mark 6/13/2012 10:15:00 AM
Hanna
This is a right join example:
PS C:\temp> Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args[1].EmployeeId} -LeftProperties "Name" -RightProperties "When" -Type AllInRight
Name When
---- ----
John 6/12/2012 08:05:01 AM
John 6/13/2012 07:59:12 AM
John 6/14/2012 07:49:10 AM
Mark 6/12/2012 10:33:00 AM
Mark 6/13/2012 10:15:00 AM
     2/29/2012 01:00:00 AM
This is a full join example:
PS C:\temp> Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args[1].EmployeeId} -LeftProperties "Name" -RightProperties "When" -Type AllInBoth
Name When
---- ----
John 6/12/2012 08:05:01 AM
John 6/13/2012 07:59:12 AM
John 6/14/2012 07:49:10 AM
Mark 6/12/2012 10:33:00 AM
Mark 6/13/2012 10:15:00 AM
Hanna
     2/29/2012 01:00:00 AM
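Joins are an important data-processing tool, because in most cases several tables have to be joined before they answer what the user actually wants to know. That is why Join-Object is so useful. And now, at long last, here is the script that implements the Join-Object command.
The Join-Object script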
function AddItemProperties($item, $properties, $output)
{
    if($item -ne $null)
    {
        foreach($property in $properties)
        {
            $propertyHash =$property -as [hashtable]
            if($propertyHash -ne $null)
            {
                $hashName=$propertyHash["name"] -as [string]
                if($hashName -eq $null)
                {
                    throw "there should be a string Name"
                }
                $expression=$propertyHash["expression"] -as [scriptblock]
                if($expression -eq $null)
                {
                    throw "there should be a ScriptBlock Expression"
                }
                $_=$item
                $expressionValue=& $expression
                $output | add-member -MemberType "NoteProperty" -Name $hashName -Value $expressionValue
            }
            else
            {
                # .psobject.Properties allows you to list the properties of any object, also known as "reflection"
                foreach($itemProperty in $item.psobject.Properties)
                {
                    if ($itemProperty.Name -like $property)
                    {
                        $output | add-member -MemberType "NoteProperty" -Name $itemProperty.Name -Value $itemProperty.Value
                    }
                }
            }
        }
    }
}
function WriteJoinObjectOutput($leftItem, $rightItem, $leftProperties, $rightProperties, $Type)
{
    $output = new-object psobject
    if($Type -eq "AllInRight")
    {
        # This mix of rightItem with LeftProperties and vice versa is due to
        # the switch of Left and Right arguments for AllInRight
        AddItemProperties $rightItem $leftProperties $output
        AddItemProperties $leftItem $rightProperties $output
    }
    else
    {
        AddItemProperties $leftItem $leftProperties $output
        AddItemProperties $rightItem $rightProperties $output
    }
    $output
}
<#
.Synopsis
   Joins two lists of objects
.DESCRIPTION
   Joins two lists of objects
.EXAMPLE
   Join-Object $a $b {$args[0].Id -eq $args[1].Id} "Name" "Salary"
#>
function Join-Object
{
    [CmdletBinding()]
    [OutputType([psobject])]
    Param
    (
        # List to join with $Right
        [Parameter(Mandatory=$true,
                   Position=0)]
        [object[]]
        $Left,
        # List to join with $Left
        [Parameter(Mandatory=$true,
                   Position=1)]
        [object[]]
        $Right,
        # Condition in which an item in the left matches an item in the right
        # typically something like: {$args[0].Id -eq $args[1].Id}
        [Parameter(Mandatory=$true,
                   Position=2)]
        [scriptblock]
        $Where,
        # Properties from $Left we want in the output.
        # Each property can:
        # - Be a plain property name like "Name"
        # - Contain wildcards like "*"
        # - Be a hashtable like @{Name="Product Name";Expression={$_.Name}}. Name is the output property name
        #   and Expression is the property value. The same syntax is available in select-object and it is
        #   important for join-object because joined lists could have a property with the same name
        [Parameter(Mandatory=$true,
                   Position=3)]
        [object[]]
        $LeftProperties,
        # Properties from $Right we want in the output.
        # Like LeftProperties, each can be a plain name, wildcard or hashtable. See the LeftProperties comments.
        [Parameter(Mandatory=$true,
                   Position=4)]
        [object[]]
        $RightProperties,
        # Type of join.
        # AllInLeft will have all elements from Left at least once in the output, in the order of Left.
        # An element appears more than once if the where clause is true for more than one element in Right,
        # and once with empty Right properties if it has no match. This is equivalent to an outer left join
        # (or simply left join) SQL statement.
        # AllInRight is similar to AllInLeft, with the roles of Left and Right switched.
        # OnlyIfInBoth will cause elements from Left to be placed in the output only if there is at least one
        # match in Right. This is equivalent to a SQL inner join (or simply join) statement.
        # AllInBoth will have all entries in Right and Left in the output. Specifically, it will have the
        # AllInLeft output, followed by all entries in Right with no matches in Left.
        # This is equivalent to a SQL full join.
        [Parameter(Mandatory=$false,
                   Position=5)]
        [ValidateSet("AllInLeft","OnlyIfInBoth","AllInBoth", "AllInRight")]
        [string]
        $Type="OnlyIfInBoth"
    )
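    Begin
    {
        # a list of the matches in right for each object in left
        $leftMatchesInRight = new-object System.Collections.ArrayList
        # the count for all matches; sized for the larger list because AllInRight
        # switches Left and Right in the Process block before the counters are indexed
        $rightMatchesCount = New-Object "object[]" ([Math]::Max($Left.Count, $Right.Count))
        for($i=0;$i -lt $rightMatchesCount.Length;$i++)
        {
            $rightMatchesCount[$i]=0
        }
    }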
    Process
    {
        if($Type -eq "AllInRight")
        {
            # for AllInRight we just switch Left and Right
            $aux = $Left
            $Left = $Right
            $Right = $aux
        }
        # go over items in $Left and produce the list of matches
        foreach($leftItem in $Left)
        {
            $leftItemMatchesInRight = new-object System.Collections.ArrayList
            $null = $leftMatchesInRight.Add($leftItemMatchesInRight)
            for($i=0; $i -lt $right.Count;$i++)
            {
                $rightItem=$right[$i]
                if($Type -eq "AllInRight")
                {
                    # For AllInRight, we want $args[0] to refer to the left and $args[1] to refer to right,
                    # but since we switched left and right, we have to switch the where arguments
                    $whereLeft = $rightItem
                    $whereRight = $leftItem
                }
                else
                {
                    $whereLeft = $leftItem
                    $whereRight = $rightItem
                }
                if(Invoke-Command -ScriptBlock $where -ArgumentList $whereLeft,$whereRight)
                {
                    $null = $leftItemMatchesInRight.Add($rightItem)
                    $rightMatchesCount[$i]++
                }
            }
        }
        # go over the list of matches and produce output
        for($i=0; $i -lt $left.Count;$i++)
        {
            $leftItemMatchesInRight=$leftMatchesInRight[$i]
            $leftItem=$left[$i]
            if($leftItemMatchesInRight.Count -eq 0)
            {
                if($Type -ne "OnlyIfInBoth")
                {
                    WriteJoinObjectOutput $leftItem $null $LeftProperties $RightProperties $Type
                }
                continue
            }
            foreach($leftItemMatchInRight in $leftItemMatchesInRight)
            {
                WriteJoinObjectOutput $leftItem $leftItemMatchInRight $LeftProperties $RightProperties $Type
            }
        }
    }
    End
    {
        # produce final output for members of right with no matches for the AllInBoth option
        if($Type -eq "AllInBoth")
        {
            for($i=0; $i -lt $right.Count;$i++)
            {
                $rightMatchCount=$rightMatchesCount[$i]
                if($rightMatchCount -eq 0)
                {
                    $rightItem=$Right[$i]
                    WriteJoinObjectOutput $null $rightItem $LeftProperties $RightProperties $Type
                }
            }
        }
    }
}
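The script evaluates the -Where condition by calling Invoke-Command with -ArgumentList, which is why the condition refers to its two inputs as $args[0] and $args[1]. The sketch below shows that mechanism on its own, outside Join-Object; the three sample objects and the variable names are made up for the illustration.

```powershell
# The same kind of condition that is passed to -Where in the examples above.
$where = { $args[0].Id -eq $args[1].EmployeeId }

# Hypothetical sample objects standing in for one left item and two right items.
$left  = New-Object psobject -Property @{ Id = '1'; Name = 'John' }
$right = New-Object psobject -Property @{ EmployeeId = '1'; When = '6/12/2012 08:05:01 AM' }
$other = New-Object psobject -Property @{ EmployeeId = '44'; When = '2/29/2012 01:00:00 AM' }

# Each element of -ArgumentList becomes one entry of $args inside the script block.
$match   = Invoke-Command -ScriptBlock $where -ArgumentList $left, $right
$noMatch = Invoke-Command -ScriptBlock $where -ArgumentList $left, $other

# The call operator passes positional arguments to $args in the same way.
$viaCall = & $where $left $right

"match: $match, noMatch: $noMatch, viaCall: $viaCall"
```

Because the condition is an arbitrary script block, it is not limited to equality on one column; any expression over the two objects that yields true or false can serve as the join condition.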
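Original article: http://blogs.msdn.com/b/powershell/archive/2012/07/13/join-object.aspx
Original author: PowerShell Team
Please respect the hard work of the original author and the editor. Reposting is welcome, provided the source is credited.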

I searched for ages and could not find this command, then looked closer and realized you wrote it yourself. I suggest saying so in the title or at the very start of the article.
PS C:\Users\travxie> $employee
employee id name
----------- ----
1           John
2           mark
3           hanna
PS C:\Users\travxie> $entrance
employeeid when
---------- ----
1          6/12/2012 08:05:01 AM
1          6/13/2012 07:59:12 AM
1          6/14/2012 07:49:10 AM
2          6/12/2012 10:33:00 AM
2          6/13/2012 10:15:00 AM
44         2/29/2012 01:00:00 AM
PS C:\Users\travxie> Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args[1].EmployeeId} -LeftProperties "Name" -RightProperties "When" -Type OnlyIfInBoth
PS C:\Users\travxie> $a=Join-Object -Left $employee -Right $entrance -Where {$args[0].Id -eq $args[1].EmployeeId} -LeftProperties "Name" -RightProperties "When" -Type OnlyIfInBoth
PS C:\Users\travxie> $a
I don't understand why there is no output, and no error either.
The site has a small problem. Could you please post your code again, inside pre or code tags? Thanks!