References:

  1. https://zhuanlan.zhihu.com/p/404946454
  2. https://zhuanlan.zhihu.com/p/67055774
  3. https://zhuanlan.zhihu.com/p/401956734

Preface

C++ itself has no GC mechanism (the customary beating of a dead horse), so UE built its own on top of UObject. Here is 南京周润发's answer, quoted directly:

UE4 gives us a UObject object system with garbage collection built in, which makes game development in C++ more convenient and lets the game itself largely avoid memory leaks.

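UE4 uses mark-and-sweep, a classic garbage collection scheme. One collection has two phases. The first phase starts from a root set and traverses every reachable object; once the traversal finishes, every object is known to be either reachable or unreachable. This phase completes within a single frame. The second phase cleans up the unreachable objects incrementally: since an unreachable object can never be accessed again, the cleanup can be spread over several frames, which avoids the visible hitch of destroying a large number of UObjects at once, for example when a map is unloaded.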

GC runs on the game thread and cleans up UObjects; multithreaded GC is supported.
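To make the two phases concrete, here is a minimal toy sketch of mark-and-sweep with an incremental sweep. It is not engine code: ToyObject, ToyHeap, and the per-call Budget are names made up for this illustration, and objects refer to each other by index instead of by pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy object: Refs holds indices of other objects in the same heap.
struct ToyObject {
    std::vector<std::size_t> Refs;
    bool bReachable = false;
    bool bDestroyed = false;
};

struct ToyHeap {
    std::vector<ToyObject> Objects;
    std::vector<std::size_t> Roots;        // indices of root objects
    std::vector<std::size_t> Unreachable;  // filled by Mark(), drained by SweepSome()

    // Phase 1: runs to completion in a single call (one "frame").
    void Mark() {
        for (ToyObject& Obj : Objects) {
            Obj.bReachable = false;
        }
        std::vector<std::size_t> Stack(Roots);
        while (!Stack.empty()) {
            const std::size_t Index = Stack.back();
            Stack.pop_back();
            if (Objects[Index].bReachable) {
                continue;  // already visited
            }
            Objects[Index].bReachable = true;
            for (std::size_t Ref : Objects[Index].Refs) {
                Stack.push_back(Ref);
            }
        }
        Unreachable.clear();
        for (std::size_t i = 0; i < Objects.size(); ++i) {
            if (!Objects[i].bReachable && !Objects[i].bDestroyed) {
                Unreachable.push_back(i);
            }
        }
    }

    // Phase 2: destroys at most Budget objects per call; returns true once nothing is left.
    bool SweepSome(std::size_t Budget) {
        while (Budget > 0 && !Unreachable.empty()) {
            Objects[Unreachable.back()].bDestroyed = true;
            Unreachable.pop_back();
            --Budget;
        }
        return Unreachable.empty();
    }
};

// Five objects: 0 -> 1 -> 2 hang off root 0; 3 and 4 reference each other but no root reaches them.
ToyHeap MakeExampleHeap() {
    ToyHeap Heap;
    Heap.Objects.resize(5);
    Heap.Objects[0].Refs = {1};
    Heap.Objects[1].Refs = {2};
    Heap.Objects[3].Refs = {4};
    Heap.Objects[4].Refs = {3};
    Heap.Roots = {0};
    return Heap;
}
```

Calling Mark() once and then SweepSome(1) on successive frames destroys the 3/4 cycle one object at a time. Splitting the sweep like this is safe because nothing reachable can still point at those objects.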

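Plenty of blog posts cover GC in great detail, especially UE4 垃圾回收GC 源码解读, which goes all the way down to a full source-level dissection. This article does not aim to go that deep; the goal is only a rough understanding of the GC flow in UE and the class structure around it.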

The UE version used in this article is still 5.0.3.

Garbage Collection

GC in UE can be triggered manually or automatically:

  • Manual: call UWorld::ForceGarbageCollection(bool bFullPurge), which forces a garbage collection on the next frame's World tick.

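  • Automatic: based on default settings (which can be reconfigured), the engine triggers garbage collection by itself at a certain interval or when certain conditions are met. Several GC parameters can be set. For example, MaxObjectsInGame caps the number of UObjects that may exist in the game (it has no effect in the editor); the default on mobile platforms is 131072, and the game crashes once the UObject count exceeds this threshold. For the other parameters, see the related properties in UGarbageCollectionSettings, GarbageCollection.cpp, and UnrealEngine.cpp.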

GC entry point

The entry point of GC is the CollectGarbage() function defined in UObject/GarbageCollection.cpp. It does three things: acquire the GC lock, run CollectGarbageInternal(), and release the GC lock.

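Because GC is multithreaded, a GC lock is needed to keep other threads from performing UObject operations that would conflict with the collection. Its main purpose is to protect the asynchronous loading process.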

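One thing it prevents: an object has just been loaded, the variable that stores it has not yet registered a reference, and the object gets collected as unreachable garbage.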

void CollectGarbage(EObjectFlags KeepFlags, bool bPerformFullPurge)
{
    // No other thread may be performing UObject operations while we're running
    AcquireGCLock();

    // Perform actual garbage collection
    CollectGarbageInternal(KeepFlags, bPerformFullPurge);

    // Other threads are free to use UObjects
    ReleaseGCLock();
}
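The acquire / collect / release shape of CollectGarbage() can also be written as a scope guard, so that the lock is released on every exit path. The sketch below is a standalone toy built on std::mutex. FToyGCScopeLock, GToyGCMutex, and the other Toy names are hypothetical and say nothing about how the engine's own lock is implemented.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

std::mutex GToyGCMutex;                  // hypothetical stand-in for the GC lock
std::atomic<bool> GToyGCRunning{false};  // lets other code observe that a collection is in progress
bool GSawLockDuringCollect = false;      // recorded by the toy collector, for demonstration only

// RAII wrapper: acquires in the constructor and releases in the destructor,
// so the lock is dropped even if the collection returns early.
class FToyGCScopeLock {
public:
    FToyGCScopeLock() : Guard(GToyGCMutex) { GToyGCRunning = true; }
    ~FToyGCScopeLock() { GToyGCRunning = false; }  // runs before Guard unlocks the mutex

private:
    std::lock_guard<std::mutex> Guard;
};

// Stand-in for CollectGarbageInternal(): only records whether the lock was held.
void ToyCollectInternal() {
    GSawLockDuringCollect = GToyGCRunning.load();
}

// Same three steps as CollectGarbage(): acquire, collect, release.
void ToyCollectGarbage() {
    FToyGCScopeLock Lock;
    ToyCollectInternal();
}
```

In this toy, any other thread that touches objects would take GToyGCMutex first, and would therefore wait until ToyCollectGarbage() has returned.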

TODO: analysis of the lock

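Before going further, we need to look at a flag type, EInternalObjectFlags, defined in UObject/ObjectMacros.h. It is an enum class used to tag an object with various flags, so that conditions such as Garbage, Unreachable, or RootSet can be queried quickly.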

/** Objects flags for internal use (GC, low level UObject code) */
enum class EInternalObjectFlags : int32
{
    None = 0,

    LoaderImport = 1 << 20, ///< Object is ready to be imported by another package during loading
    Garbage = 1 << 21, ///< Garbage from logical point of view and should not be referenced. This flag is mirrored in EObjectFlags as RF_Garbage for performance
    PersistentGarbage = 1 << 22, ///< Same as above but referenced through a persistent reference so it can't be GC'd
    ReachableInCluster = 1 << 23, ///< External reference to object in cluster exists
    ClusterRoot = 1 << 24, ///< Root of a cluster
    Native = 1 << 25, ///< Native (UClass only).
    Async = 1 << 26, ///< Object exists only on a different thread than the game thread.
    AsyncLoading = 1 << 27, ///< Object is being asynchronously loaded.
    Unreachable = 1 << 28, ///< Object is not reachable on the object graph.
    PendingKill UE_DEPRECATED(5.0, "PendingKill flag should no longer be used. Use Garbage flag instead.") = 1 << 29, ///< Objects that are pending destruction (invalid for gameplay but valid objects). This flag is mirrored in EObjectFlags as RF_PendingKill for performance
    RootSet = 1 << 30, ///< Object will not be garbage collected, even if unreferenced.
    PendingConstruction = 1 << 31, ///< Object didn't have its class constructor called yet (only the UObjectBase one to initialize its most basic members)

    GarbageCollectionKeepFlags = Native | Async | AsyncLoading | LoaderImport,
    PRAGMA_DISABLE_DEPRECATION_WARNINGS
    MirroredFlags = Garbage | PendingKill, /// Flags mirrored in EObjectFlags

    //~ Make sure this is up to date!
    AllFlags = LoaderImport | Garbage | PersistentGarbage | ReachableInCluster | ClusterRoot | Native | Async | AsyncLoading | Unreachable | PendingKill | RootSet | PendingConstruction
    PRAGMA_ENABLE_DEPRECATION_WARNINGS
};
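Since each flag occupies its own bit, setting, clearing, and testing a flag are single bitwise operations, which is what makes these queries cheap. The sketch below shows the idea on a cut-down enum. EToyInternalFlags and FToyObjectItem are hypothetical names for this illustration, and only three of the values above are copied.

```cpp
#include <cassert>
#include <cstdint>

// Cut-down copy of a few values from the enum above, for illustration only.
enum class EToyInternalFlags : std::int32_t {
    None        = 0,
    Garbage     = 1 << 21,
    Unreachable = 1 << 28,
    RootSet     = 1 << 30,
};

constexpr EToyInternalFlags operator|(EToyInternalFlags A, EToyInternalFlags B) {
    return static_cast<EToyInternalFlags>(static_cast<std::int32_t>(A) | static_cast<std::int32_t>(B));
}

constexpr EToyInternalFlags operator&(EToyInternalFlags A, EToyInternalFlags B) {
    return static_cast<EToyInternalFlags>(static_cast<std::int32_t>(A) & static_cast<std::int32_t>(B));
}

constexpr EToyInternalFlags operator~(EToyInternalFlags A) {
    return static_cast<EToyInternalFlags>(~static_cast<std::int32_t>(A));
}

// Hypothetical per-object record that carries the flag bits.
struct FToyObjectItem {
    EToyInternalFlags Flags = EToyInternalFlags::None;

    void SetFlags(EToyInternalFlags InFlags) { Flags = Flags | InFlags; }
    void ClearFlags(EToyInternalFlags InFlags) { Flags = Flags & ~InFlags; }

    // True if at least one of the requested bits is set.
    bool HasAnyFlags(EToyInternalFlags InFlags) const {
        return (Flags & InFlags) != EToyInternalFlags::None;
    }
};
```

A combined mask such as GarbageCollectionKeepFlags works the same way: HasAnyFlags() with an OR of several bits answers "does the object carry any of these" in one test.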

CollectGarbageInternal

As mentioned above, CollectGarbage() simply calls CollectGarbageInternal() to do the real GC work, so let's look at how that function is implemented.

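The function takes two parameters. KeepFlags specifies a set of flags: objects carrying any of them are never collected, whether or not they are referenced. bPerformFullPurge specifies whether a full purge runs after the mark phase.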

With the code that is not relevant to our main line of analysis removed, what is left of the function looks like this:

/**
 * Deletes all unreferenced objects, keeping objects that have any of the passed in KeepFlags set
 *
 * @param KeepFlags objects with those flags will be kept regardless of being referenced or not
 * @param bPerformFullPurge if true, perform a full purge after the mark pass
 */
void CollectGarbageInternal(EObjectFlags KeepFlags, bool bPerformFullPurge)
{
    // ...
    // Perform reachability analysis.
    {
        const double StartTime = FPlatformTime::Seconds();
        FRealtimeGC TagUsedRealtimeGC;

        TagUsedRealtimeGC.PerformReachabilityAnalysis(KeepFlags, Options);
        UE_LOG(LogGarbage, Log, TEXT("%f ms for GC"), (FPlatformTime::Seconds() - StartTime) * 1000);
    }

    FGCArrayPool& ArrayPool = FGCArrayPool::Get();
    TArray<FGCArrayStruct*> AllArrays;
    ArrayPool.GetAllArrayStructsFromPool(AllArrays);
    // This needs to happen before clusters get dissolved otherwise cluster information will be missing from history
    ArrayPool.UpdateGCHistory(AllArrays);

    // ...

    {
        ArrayPool.DumpGarbageReferencers(AllArrays);

        GatherUnreachableObjects(!(Options & EFastReferenceCollectorOptions::Parallel));
        NotifyUnreachableObjects(GUnreachableObjects);

        // This needs to happen after NotifyGarbageReferencers and GatherUnreachableObjects since both can mark more objects as unreachable
        ArrayPool.ClearWeakReferences(AllArrays);

        // Now return arrays back to the pool and free some memory if requested
        for (int32 Index = 0; Index < AllArrays.Num(); ++Index)
        {
            FGCArrayStruct* ArrayStruct = AllArrays[Index];
            if (bPerformFullPurge
                || Index % 7 == 3) // delete 1/7th of them just to keep things from growing too much between full purges
            {
                ArrayPool.FreeArrayStruct(ArrayStruct);
            }
            else
            {
                ArrayPool.ReturnToPool(ArrayStruct);
            }
        }

        // Make sure nothing will be using potentially freed arrays
        AllArrays.Empty();

        if (bPerformFullPurge || !GIncrementalBeginDestroyEnabled)
        {
            UnhashUnreachableObjects(/**bUseTimeLimit = */ false);
            FScopedCBDProfile::DumpProfile();
        }
    }

    // ...
}

As we can see, the function as a whole performs two phases of work: mark and sweep. Let's look at each in turn.
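One small detail in the listing is the Index % 7 == 3 test in the loop that hands arrays back to the pool. The standalone sketch below replays just that condition to show which indices get freed; IndicesFreed is a helper name invented for this illustration.

```cpp
#include <cassert>
#include <vector>

// Replays the condition from the loop above and returns the indices that would be
// freed (rather than returned to the pool) for NumArrays pooled arrays.
std::vector<int> IndicesFreed(int NumArrays, bool bPerformFullPurge) {
    std::vector<int> Freed;
    for (int Index = 0; Index < NumArrays; ++Index) {
        if (bPerformFullPurge || Index % 7 == 3) {
            Freed.push_back(Index);
        }
    }
    return Freed;
}
```

With 14 arrays and no full purge, only indices 3 and 10 are freed, roughly one in seven as the source comment says. With fewer than 4 arrays nothing is freed unless a full purge is requested.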

PerformReachabilityAnalysis

In CollectGarbageInternal(), the code related to marking is this part:

// Perform reachability analysis.
{
    const double StartTime = FPlatformTime::Seconds();
    FRealtimeGC TagUsedRealtimeGC;

    TagUsedRealtimeGC.PerformReachabilityAnalysis(KeepFlags, Options);
    UE_LOG(LogGarbage, Log, TEXT("%f ms for GC"), (FPlatformTime::Seconds() - StartTime) * 1000);
}

The main logic here is again delegated to another function, PerformReachabilityAnalysis(), so we continue with that one:

/**
 * Performs reachability analysis.
 *
 * @param KeepFlags Objects with these flags will be kept regardless of being referenced or not
 */
void PerformReachabilityAnalysis(EObjectFlags KeepFlags, const EFastReferenceCollectorOptions InOptions)
{
    LLM_SCOPE(ELLMTag::GC);

    SCOPED_NAMED_EVENT(FRealtimeGC_PerformReachabilityAnalysis, FColor::Red);
    DECLARE_SCOPE_CYCLE_COUNTER(TEXT("FRealtimeGC::PerformReachabilityAnalysis"), STAT_FArchiveRealtimeGC_PerformReachabilityAnalysis, STATGROUP_GC);

    /** Growing array of objects that require serialization */
    FGCArrayStruct* ArrayStruct = FGCArrayPool::Get().GetArrayStructFromPool();
    TArray<UObject*>& ObjectsToSerialize = ArrayStruct->ObjectsToSerialize;

    // Reset object count.
    GObjectCountDuringLastMarkPhase.Reset();

    // Make sure GC referencer object is checked for references to other objects even if it resides in permanent object pool
    if (FPlatformProperties::RequiresCookedData() && FGCObject::GGCObjectReferencer && GUObjectArray.IsDisregardForGC(FGCObject::GGCObjectReferencer))
    {
        ObjectsToSerialize.Add(FGCObject::GGCObjectReferencer);
    }

    {
        const double StartTime = FPlatformTime::Seconds();
        // Mark phase doesn't care about PendingKill being enabled or not so there's just fewer compiled in functions
        const EFastReferenceCollectorOptions OptionsForMarkPhase = InOptions & ~EFastReferenceCollectorOptions::WithPendingKill;
        (this->*MarkObjectsFunctions[GetGCFunctionIndex(OptionsForMarkPhase)])(ObjectsToSerialize, KeepFlags);
        UE_LOG(LogGarbage, Verbose, TEXT("%f ms for MarkObjectsAsUnreachable Phase (%d Objects To Serialize)"), (FPlatformTime::Seconds() - StartTime) * 1000, ObjectsToSerialize.Num());
    }

    {
        const double StartTime = FPlatformTime::Seconds();
        PerformReachabilityAnalysisOnObjects(ArrayStruct, InOptions);
        UE_LOG(LogGarbage, Verbose, TEXT("%f ms for Reachability Analysis"), (FPlatformTime::Seconds() - StartTime) * 1000);
    }

    // Allowing external systems to add object roots. This can't be done through AddReferencedObjects
    // because it may require tracing objects (via FGarbageCollectionTracer) multiple times
    FCoreUObjectDelegates::TraceExternalRootsForReachabilityAnalysis.Broadcast(*this, KeepFlags, !(InOptions & EFastReferenceCollectorOptions::Parallel));

    FGCArrayPool::Get().ReturnToPool(ArrayStruct);

#if UE_BUILD_DEBUG
    FGCArrayPool::Get().CheckLeaks();
#endif
}

virtual void PerformReachabilityAnalysisOnObjects(FGCArrayStruct* ArrayStruct, const EFastReferenceCollectorOptions InOptions) override
{
    (this->*ReachabilityAnalysisFunctions[GetGCFunctionIndex(InOptions)])(ArrayStruct);
}

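The first half of the function prepares an array named ObjectsToSerialize for the functions that follow and, when the condition in the if statement holds, adds FGCObject::GGCObjectReferencer to it. As the analysis below will show, this array is what holds the reachable objects.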

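Note also that the array is obtained through FGCArrayPool::Get().GetArrayStructFromPool(). As the name suggests, this is presumably an object-pool or memory-pool style structure inside the GC: FGCArrayPool requests and hands out array storage centrally, which avoids many of the problems that ad hoc allocation would bring.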

/** Growing array of objects that require serialization */
FGCArrayStruct* ArrayStruct = FGCArrayPool::Get().GetArrayStructFromPool();
TArray<UObject*>& ObjectsToSerialize = ArrayStruct->ObjectsToSerialize;

// Reset object count.
GObjectCountDuringLastMarkPhase.Reset();

// Make sure GC referencer object is checked for references to other objects even if it resides in permanent object pool
if (FPlatformProperties::RequiresCookedData() && FGCObject::GGCObjectReferencer && GUObjectArray.IsDisregardForGC(FGCObject::GGCObjectReferencer))
{
    ObjectsToSerialize.Add(FGCObject::GGCObjectReferencer);
}
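The pooling idea guessed at above can be sketched in a few lines. This is only a toy under our own assumptions, not FGCArrayPool itself: FToyArrayPool and FToyArrayStruct are hypothetical, int stands in for UObject*, and the real pool's thread safety is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct FToyArrayStruct {
    std::vector<int> ObjectsToSerialize;  // int stands in for UObject*
};

class FToyArrayPool {
public:
    // Hands out a pooled array if one is available, otherwise allocates a new one.
    std::unique_ptr<FToyArrayStruct> GetArrayStructFromPool() {
        if (Pool.empty()) {
            ++NumAllocations;
            return std::make_unique<FToyArrayStruct>();
        }
        std::unique_ptr<FToyArrayStruct> Result = std::move(Pool.back());
        Pool.pop_back();
        return Result;
    }

    // Empties the array but keeps its capacity, so the next user does not reallocate.
    void ReturnToPool(std::unique_ptr<FToyArrayStruct> ArrayStruct) {
        ArrayStruct->ObjectsToSerialize.clear();
        Pool.push_back(std::move(ArrayStruct));
    }

    std::size_t NumAllocations = 0;  // how many arrays were ever allocated

private:
    std::vector<std::unique_ptr<FToyArrayStruct>> Pool;
};
```

The point of the design is that a collection which needs a large worklist every time pays for growing it once. Later collections reuse the same storage.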

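The array is then passed by reference to a function, which hints that the function probably fills the array while it runs. The syntax here is a little hard to read, so let's take it apart piece by piece.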

{
    const double StartTime = FPlatformTime::Seconds();
    // Mark phase doesn't care about PendingKill being enabled or not so there's just fewer compiled in functions
    const EFastReferenceCollectorOptions OptionsForMarkPhase = InOptions & ~EFastReferenceCollectorOptions::WithPendingKill;
    (this->*MarkObjectsFunctions[GetGCFunctionIndex(OptionsForMarkPhase)])(ObjectsToSerialize, KeepFlags);
    UE_LOG(LogGarbage, Verbose, TEXT("%f ms for MarkObjectsAsUnreachable Phase (%d Objects To Serialize)"), (FPlatformTime::Seconds() - StartTime) * 1000, ObjectsToSerialize.Num());
}

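First, MarkObjectsFunctions must be a member variable, and it must be an array of pointers to member functions; that is what allows the call form below, where ->* binds the selected pointer to this. Second, GetGCFunctionIndex(OptionsForMarkPhase) decides which function in the array is selected.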

(this->*MarkObjectsFunctions[GetGCFunctionIndex(OptionsForMarkPhase)])(ObjectsToSerialize, KeepFlags);
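If the (this->*Array[Index])(Args) form looks unfamiliar, the following standalone sketch reproduces it in miniature. FToyGC, MarkSerial, and MarkParallel are hypothetical names; only the call syntax mirrors the line above.

```cpp
#include <cassert>
#include <string>

class FToyGC {
public:
    // Type of a pointer to a member function taking an int and returning a string.
    using MarkFn = std::string (FToyGC::*)(int);

    FToyGC() {
        MarkFunctions[0] = &FToyGC::MarkSerial;
        MarkFunctions[1] = &FToyGC::MarkParallel;
    }

    // Picks an entry by index, binds it to this object with ->*, then calls it.
    std::string Mark(bool bParallel, int NumObjects) {
        const int Index = bParallel ? 1 : 0;
        return (this->*MarkFunctions[Index])(NumObjects);
    }

private:
    std::string MarkSerial(int NumObjects) { return "serial:" + std::to_string(NumObjects); }
    std::string MarkParallel(int NumObjects) { return "parallel:" + std::to_string(NumObjects); }

    MarkFn MarkFunctions[2];
};
```

The outer parentheses are required: the function call operator binds tighter than ->*, so without them the expression would not parse as "select, bind, then call".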

The same pattern shows up again later, so we analyze both uses here. At the very top of the FRealtimeGC class, two such arrays are defined:

/** Pointers to functions used for Marking objects as unreachable */
MarkObjectsFn MarkObjectsFunctions[4];
/** Pointers to functions used for Reachability Analysis */
ReachabilityAnalysisFn ReachabilityAnalysisFunctions[8];

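From the names and comments it is easy to see that these two function pointer arrays correspond to two jobs, marking objects and reachability analysis, and that both arrays are initialized in the class constructor.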

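Looking closer, the arrays hold instantiations of two function templates, MarkObjectsAsUnreachable() and PerformReachabilityAnalysisOnObjectsInternal(), under different template arguments. Each instantiation handles GC for one combination of the Parallel, WithClusters, and WithPendingKill options. MarkObjectsFunctions only covers Parallel and WithClusters (4 entries), while ReachabilityAnalysisFunctions covers all three (8 entries).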

/** Default constructor, initializing all members. */
FRealtimeGC()
{
    MarkObjectsFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::None)] = &FRealtimeGC::MarkObjectsAsUnreachable<false, false>;
    MarkObjectsFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::None)] = &FRealtimeGC::MarkObjectsAsUnreachable<true, false>;
    MarkObjectsFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::WithClusters)] = &FRealtimeGC::MarkObjectsAsUnreachable<false, true>;
    MarkObjectsFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::WithClusters)] = &FRealtimeGC::MarkObjectsAsUnreachable<true, true>;

    ReachabilityAnalysisFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::None)] = &FRealtimeGC::PerformReachabilityAnalysisOnObjectsInternal<EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::None>;
    ReachabilityAnalysisFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::None)] = &FRealtimeGC::PerformReachabilityAnalysisOnObjectsInternal<EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::None>;
    ReachabilityAnalysisFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::WithClusters)] = &FRealtimeGC::PerformReachabilityAnalysisOnObjectsInternal<EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::WithClusters>;
    ReachabilityAnalysisFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::WithClusters)] = &FRealtimeGC::PerformReachabilityAnalysisOnObjectsInternal<EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::WithClusters>;

    ReachabilityAnalysisFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::WithPendingKill)] = &FRealtimeGC::PerformReachabilityAnalysisOnObjectsInternal<EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::WithPendingKill>;
    ReachabilityAnalysisFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::WithPendingKill)] = &FRealtimeGC::PerformReachabilityAnalysisOnObjectsInternal<EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::WithPendingKill>;
    ReachabilityAnalysisFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::WithClusters | EFastReferenceCollectorOptions::WithPendingKill)] = &FRealtimeGC::PerformReachabilityAnalysisOnObjectsInternal<EFastReferenceCollectorOptions::None | EFastReferenceCollectorOptions::WithClusters | EFastReferenceCollectorOptions::WithPendingKill>;
    ReachabilityAnalysisFunctions[GetGCFunctionIndex(EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::WithClusters | EFastReferenceCollectorOptions::WithPendingKill)] = &FRealtimeGC::PerformReachabilityAnalysisOnObjectsInternal<EFastReferenceCollectorOptions::Parallel | EFastReferenceCollectorOptions::WithClusters | EFastReferenceCollectorOptions::WithPendingKill>;
}
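The constructor above is an instance of a general pattern: turn runtime options into an index once, and let that index select a function that was specialized at compile time. Here is a reduced sketch with two options and four table slots. EToyOptions, ToyFunctionIndex, and FToyRealtimeGC are hypothetical, and ToyFunctionIndex is our own mapping for this sketch, not a copy of GetGCFunctionIndex().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical option bits; the values are chosen for this sketch only.
enum class EToyOptions : std::uint32_t {
    None         = 0,
    Parallel     = 1 << 0,
    WithClusters = 1 << 1,
};

constexpr EToyOptions operator|(EToyOptions A, EToyOptions B) {
    return static_cast<EToyOptions>(static_cast<std::uint32_t>(A) | static_cast<std::uint32_t>(B));
}

// Hypothetical index function: the two option bits map directly to slots 0..3.
constexpr int ToyFunctionIndex(EToyOptions Options) {
    return static_cast<int>(Options);
}

class FToyRealtimeGC {
private:
    // The options are template arguments, so each instantiation is compiled with its
    // branches already decided. The return value just encodes which one ran.
    template <bool bParallel, bool bWithClusters>
    int Mark() const {
        return (bParallel ? 1 : 0) + (bWithClusters ? 10 : 0);
    }

public:
    using MarkFn = int (FToyRealtimeGC::*)() const;

    FToyRealtimeGC() {
        MarkFunctions[ToyFunctionIndex(EToyOptions::None)] = &FToyRealtimeGC::Mark<false, false>;
        MarkFunctions[ToyFunctionIndex(EToyOptions::Parallel)] = &FToyRealtimeGC::Mark<true, false>;
        MarkFunctions[ToyFunctionIndex(EToyOptions::WithClusters)] = &FToyRealtimeGC::Mark<false, true>;
        MarkFunctions[ToyFunctionIndex(EToyOptions::Parallel | EToyOptions::WithClusters)] = &FToyRealtimeGC::Mark<true, true>;
    }

    // One table lookup at runtime replaces per-object option checks inside the hot loop.
    int Run(EToyOptions Options) const {
        return (this->*MarkFunctions[ToyFunctionIndex(Options)])();
    }

private:
    MarkFn MarkFunctions[4];
};
```

Adding a third option doubles the table, which matches the 4-entry and 8-entry arrays declared in FRealtimeGC.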

Summary

At this point we can pause for a short summary:

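UE4's GC uses mark-and-sweep. One collection has two phases, mark and sweep. The first phase starts from a root set, traverses all reachable objects, and marks every object as reachable or unreachable. The second phase cleans up the unreachable objects incrementally: since an unreachable object can never be accessed again, the cleanup can be spread over several frames.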

Concretely, the GC entry point CollectGarbage() first acquires the GC lock and then calls CollectGarbageInternal(), whose logic splits into a mark phase and a sweep phase.

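The mark phase calls PerformReachabilityAnalysis() to run the reachability analysis, which has two steps: mark the objects that already exist, then start from the initial objects to mark more. The FRealtimeGC class uses two function pointer arrays to select the GC logic for each combination of Parallel, WithClusters, and WithPendingKill. The arrays store pointers to the functions instantiated from two function templates, MarkObjectsAsUnreachable() and PerformReachabilityAnalysisOnObjectsInternal(), under different template arguments.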
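As a preview, here is one way these two steps could fit together. It is inferred only from the function names and from the ObjectsToSerialize array, and it is not verified against the engine code that the next section covers. FToyNode and both Toy functions are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct FToyNode {
    std::vector<std::size_t> Refs;  // indices of referenced objects
    bool bRoot = false;             // stands in for the RootSet flag
    bool bKeep = false;             // stands in for "carries one of the KeepFlags"
    bool bUnreachable = false;
};

// Step 1: flag every object as unreachable, except roots and kept objects,
// which seed the worklist instead.
std::vector<std::size_t> ToyMarkObjectsAsUnreachable(std::vector<FToyNode>& Objects) {
    std::vector<std::size_t> ObjectsToSerialize;
    for (std::size_t i = 0; i < Objects.size(); ++i) {
        const bool bSeed = Objects[i].bRoot || Objects[i].bKeep;
        Objects[i].bUnreachable = !bSeed;
        if (bSeed) {
            ObjectsToSerialize.push_back(i);
        }
    }
    return ObjectsToSerialize;
}

// Step 2: follow references from the worklist and clear the flag on everything reached.
void ToyReachabilityAnalysis(std::vector<FToyNode>& Objects, std::vector<std::size_t> ObjectsToSerialize) {
    while (!ObjectsToSerialize.empty()) {
        const std::size_t Index = ObjectsToSerialize.back();
        ObjectsToSerialize.pop_back();
        for (std::size_t Ref : Objects[Index].Refs) {
            if (Objects[Ref].bUnreachable) {
                Objects[Ref].bUnreachable = false;  // reached, so no longer garbage
                ObjectsToSerialize.push_back(Ref);
            }
        }
    }
}
```

Whatever still carries bUnreachable after step 2 is garbage. An object that references a root but is referenced by nothing stays flagged, because only incoming references count.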

The next section analyzes these functions in depth.