DP | nex3z's blog

[RL Notes] Advantages of Temporal-Difference Learning

Author: nex3z 2019-10-27

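Temporal-difference (TD) learning combines key ideas from dynamic programming (DP) and Monte Carlo (MC) methods. Its main advantages are as follows. Like MC, TD needs no model of the environment and learns directly from experience, whereas DP requires a model. Like DP, TD bootstraps, whereas MC does not. TD can update online and incrementally after every step, which neither DP nor MC does; MC, for instance, must wait until the end of an episode before it can update. TD converges asymptotically to the correct predictions, and it usually converges faster than MC.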
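To make the contrast concrete, here is a minimal sketch of a tabular TD(0) update next to a constant-step-size, every-visit MC update. The function names, the dictionary-based value table, and the `(state, reward)` episode layout are assumptions made for illustration; they do not come from the original post.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0, terminal=False):
    """One TD(0) step: move V(s) toward the bootstrapped target r + gamma * V(s')."""
    # Bootstrapping: the target uses the current estimate of the next state's value.
    bootstrap = 0.0 if terminal else values[next_state]
    td_error = reward + gamma * bootstrap - values[state]
    values[state] += alpha * td_error
    return td_error


def mc_update(values, episode, alpha=0.1, gamma=1.0):
    """Every-visit constant-alpha MC: requires the complete episode before updating.

    `episode` is a list of (state, reward received after leaving that state).
    """
    ret = 0.0
    # Walk backwards so the return G_t accumulates from the end of the episode.
    for state, reward in reversed(episode):
        ret = reward + gamma * ret
        values[state] += alpha * (ret - values[state])
```

`td0_update` can be called as soon as a single transition has been observed, while `mc_update` has to be handed a finished episode. Neither function touches a transition model, which is what separates both from DP.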

Reinforcement Learning

DP, MC, Reinforcement Learning, TD

URAL 1183. Brackets Sequence

1. Problem http://acm.timus.ru/problem.aspx?space=1&num=1183 1183. Brackets Sequence Time limit: 1.0 second Memory limit: 64 MB Let us define a regular brackets sequence in the following way: Empty seq…
Read more

ACM

ACM, C/C++, DP, String

URAL 1238. Folding

Author: nex3z 2015-10-31

1. Problem http://acm.timus.ru/problem.aspx?space=1&num=1238 1238. Folding Time limit: 1.0 second Memory limit: 64 MB Bill is trying to compactly represent sequences of capital alphabetic characters fr…
Read more

ACM

ACM, C/C++, DP

URAL 1152. False Mirrors

Author: nex3z 2015-10-31

1. Problem http://acm.timus.ru/problem.aspx?space=1&num=1152 1152. False Mirrors Time limit: 2.0 second Memory limit: 64 MB Background We wandered in the labyrinth for twenty minutes before finally ent…
Read more

ACM

ACM, C/C++, DFS, DP

URAL 1039. Anniversary Party

Author: nex3z 2015-10-31

1. Problem http://acm.timus.ru/problem.aspx?space=1&num=1039 1039. Anniversary Party Time limit: 0.5 second Memory limit: 8 MB Background The president of the Ural State University is going to make an …
Read more

ACM

ACM, C/C++, DP

POJ 1463. Strategic game

Author: nex3z 2015-10-30

1. Problem http://poj.org/problem?id=1463 Strategic game Time Limit: 2000MS Memory Limit: 10000K Total Submissions: 7290 Accepted: 3379 Description Bob enjoys playing computer games, especially strategic g…
Read more

ACM

ACM, C/C++, DP

The Largest Pattern Problem

Author: nex3z 2015-10-26

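The post on the largest square problem described a method for finding the largest square; with a small change, the same method can be used to find the largest pattern. 0. Problem description: There is an M*N grid of square cells, some of which are painted blue, as shown on the left of Figure 1. Find the height of the largest cross that the blue cells can form. On the left of Figure 1, the largest cross formed by blue cells has height 5; it is marked in dark blue on the right of Figure 1. 1. Approach: Unlike a square, a cross is hard to decompose into smaller crosses, so consider splitting the cross into more…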
Read more

ACM

ACM, DP

The Largest Square Problem

Author: nex3z 2015-10-26

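0. Problem description: There is an M*N grid of square cells, some of which are painted blue, as shown in Figure 1. Find the side length of the largest square that the blue cells can form. In Figure 1, the largest such square has side length 3. 1. Preprocessing: Let g[i][j] denote the side length of the largest square whose bottom-right corner is the cell in row i, column j. Initialize g[i][j] as follows: if the cell is white, set g[i][j] = 0; if the cell is blue, set g[i][j] = 1. The result is shown in Figure 2. Note…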
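The teaser is cut off before the update step, so the following is only a sketch of how the g[i][j] table is commonly completed. It assumes the standard recurrence g[i][j] = min(g[i-1][j], g[i][j-1], g[i-1][j-1]) + 1 for blue cells, which may differ from what the full post does. The function name `largest_square` and the 0/1 grid encoding (1 for blue, 0 for white) are likewise assumptions.

```python
def largest_square(grid):
    """Return the side length of the largest all-blue square.

    grid[i][j] is truthy for a blue cell and falsy for a white one.
    """
    if not grid or not grid[0]:
        return 0
    rows, cols = len(grid), len(grid[0])
    # Initialization as described: 1 for blue cells, 0 for white cells.
    g = [[1 if grid[i][j] else 0 for j in range(cols)] for i in range(rows)]
    best = max(max(row) for row in g)
    for i in range(1, rows):
        for j in range(1, cols):
            if g[i][j]:
                # Assumed recurrence: a square ending at (i, j) extends the
                # smallest of the squares ending above, to the left, and diagonally.
                g[i][j] = min(g[i - 1][j], g[i][j - 1], g[i - 1][j - 1]) + 1
                best = max(best, g[i][j])
    return best
```

Each cell is visited once, so the sketch runs in O(M*N) time and uses an extra M*N table for g.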
Read more

ACM

ACM, DP

POJ 3624. Charm Bracelet

Author: nex3z 2015-10-09

1. Problem http://poj.org/problem?id=3624 Charm Bracelet Time Limit: 1000MS Memory Limit: 65536K Total Submissions: 28195 Accepted: 12696 Description Bessie has gone to the mall’s jewelry store and s…
Read more

ACM

ACM, C/C++, DP

POJ 1384. Piggy-Bank

Author: nex3z 2015-10-09

1. Problem http://poj.org/problem?id=1384 Piggy-Bank Time Limit: 1000MS Memory Limit: 10000K Total Submissions: 9043 Accepted: 4413 Description Before ACM can do anything, a budget must be prepared and the…
Read more

Uncategorized

ACM, C/C++, DP

