[multi-catalog](hudi) impl hudi_metadata table value function #46137

Merged
merged 12 commits into from
Jan 3, 2025

suxiaogang223
Contributor

@suxiaogang223 suxiaogang223 commented Dec 30, 2024

What problem does this PR solve?

Release note

Implement the hudi_meta table-valued function (TVF) to query Hudi table metadata.

select * from hudi_meta("table" = "hive_krb.regression_hudi.timetravel_cow", "query_type" = "timeline");
+-------------------+--------+--------------------------+-----------+-----------------------+
| timestamp         | action | file_name                | state     | state_transition_time |
+-------------------+--------+--------------------------+-----------+-----------------------+
| 20240724195843565 | commit | 20240724195843565.commit | COMPLETED | 20240724195844269     |
| 20240724195845718 | commit | 20240724195845718.commit | COMPLETED | 20240724195846653     |
| 20240724195848377 | commit | 20240724195848377.commit | COMPLETED | 20240724195849337     |
| 20240724195850799 | commit | 20240724195850799.commit | COMPLETED | 20240724195851676     |
+-------------------+--------+--------------------------+-----------+-----------------------+
4 rows in set (0.03 sec)
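
As an illustration of reading the timeline output above, the following sketch computes how long each commit took to reach its recorded state transition. It assumes (this PR does not state it) that both the timestamp and state_transition_time columns use a 17-digit yyyyMMddHHmmssSSS instant format. The helper names parse_instant and completion_latency_ms are hypothetical and are not part of Doris or Hudi.

```python
from datetime import datetime, timedelta


def parse_instant(value: str) -> datetime:
    """Parse a 17-digit instant string, assumed to be yyyyMMddHHmmssSSS."""
    if len(value) != 17 or not value.isdigit():
        raise ValueError(f"unexpected instant format: {value!r}")
    seconds_part = datetime.strptime(value[:14], "%Y%m%d%H%M%S")
    # The last three digits are assumed to be milliseconds.
    return seconds_part + timedelta(milliseconds=int(value[14:]))


def completion_latency_ms(timestamp: str, state_transition_time: str) -> int:
    """Milliseconds between the instant time and its state transition time."""
    delta = parse_instant(state_transition_time) - parse_instant(timestamp)
    return delta // timedelta(milliseconds=1)


# Rows copied from the query result above: (timestamp, state_transition_time).
rows = [
    ("20240724195843565", "20240724195844269"),
    ("20240724195845718", "20240724195846653"),
    ("20240724195848377", "20240724195849337"),
    ("20240724195850799", "20240724195851676"),
]

latencies = [completion_latency_ms(ts, stt) for ts, stt in rows]
print(latencies)  # [704, 935, 960, 877]
```

Under that format assumption, each of the four commits completed in under one second.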

See the documentation PR: apache/doris-website#1673

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Dec 30, 2024

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Contributor

@morningman morningman left a comment

Please add test cases.

gensrc/thrift/Types.thrift (outdated, resolved)
@suxiaogang223
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.90% (10122/26018)
Line Coverage: 29.89% (85513/286117)
Region Coverage: 29.02% (43717/150619)
Branch Coverage: 25.56% (22301/87260)
Coverage Report: http://coverage.selectdb-in.cc/coverage/29aed7d21a0a83bffd49dbf9d881d186ae408865_29aed7d21a0a83bffd49dbf9d881d186ae408865/report/index.html

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32788 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit de5a28cbe75f6a0fbae826970c44d9dfb7f8e500, data reload: false

------ Round 1 ----------------------------------
q1	17586	6148	6084	6084
q2	2050	298	169	169
q3	10469	1226	730	730
q4	10224	888	430	430
q5	7825	2225	2050	2050
q6	206	183	146	146
q7	894	767	608	608
q8	9227	1400	1221	1221
q9	5293	4890	5003	4890
q10	6762	2313	1899	1899
q11	483	270	256	256
q12	363	354	227	227
q13	17772	3584	2977	2977
q14	234	233	203	203
q15	564	513	499	499
q16	629	597	604	597
q17	561	862	324	324
q18	7217	6506	6545	6506
q19	1917	967	559	559
q20	313	306	184	184
q21	2897	2165	1930	1930
q22	366	337	299	299
Total cold run time: 103852 ms
Total hot run time: 32788 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6286	6222	6208	6208
q2	238	326	229	229
q3	2267	2640	2341	2341
q4	1400	1830	1381	1381
q5	4377	4787	4864	4787
q6	191	182	143	143
q7	2088	1966	1817	1817
q8	2665	2823	2711	2711
q9	7389	7292	7343	7292
q10	3077	3361	2816	2816
q11	565	509	497	497
q12	662	767	636	636
q13	3406	3722	3060	3060
q14	288	323	278	278
q15	565	525	505	505
q16	656	689	653	653
q17	1232	1788	1253	1253
q18	7622	7325	7427	7325
q19	814	1114	1119	1114
q20	2076	2096	1933	1933
q21	5789	5329	4942	4942
q22	603	627	578	578
Total cold run time: 54256 ms
Total hot run time: 52499 ms
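
The report above does not label its columns. The sketch below shows how the totals appear to be derived, assuming each row holds a query name, one cold run, two hot runs, and the faster of the two hot runs (all in milliseconds). Under that reading, summing the first and last numeric columns of Round 1 reproduces the reported 103852 ms and 32788 ms. The function name summarize and the column interpretation are assumptions and are not taken from the tpch-tools scripts.

```python
# Round 1 rows copied from the TPC-H report above.
ROUND_1 = """
q1 17586 6148 6084 6084
q2 2050 298 169 169
q3 10469 1226 730 730
q4 10224 888 430 430
q5 7825 2225 2050 2050
q6 206 183 146 146
q7 894 767 608 608
q8 9227 1400 1221 1221
q9 5293 4890 5003 4890
q10 6762 2313 1899 1899
q11 483 270 256 256
q12 363 354 227 227
q13 17772 3584 2977 2977
q14 234 233 203 203
q15 564 513 499 499
q16 629 597 604 597
q17 561 862 324 324
q18 7217 6506 6545 6506
q19 1917 967 559 559
q20 313 306 184 184
q21 2897 2165 1930 1930
q22 366 337 299 299
"""


def summarize(report: str) -> tuple[int, int]:
    """Return (cold_total_ms, hot_total_ms) for a block of report rows."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        name, *values = line.split()
        cold, hot_1, hot_2, best_hot = (int(v) for v in values)
        # Assumed meaning of the last column: the faster of the two hot runs.
        if best_hot != min(hot_1, hot_2):
            raise ValueError(f"{name}: last column is not min of the hot runs")
        cold_total += cold
        hot_total += best_hot
    return cold_total, hot_total


cold_ms, hot_ms = summarize(ROUND_1)
print(cold_ms, hot_ms)  # 103852 32788
```

The consistency check inside the loop raises if a row does not fit the assumed layout, so the sketch fails loudly rather than silently summing the wrong column.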

@doris-robot

TPC-DS: Total hot run time: 198017 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit de5a28cbe75f6a0fbae826970c44d9dfb7f8e500, data reload: false

query1	1313	982	935	935
query2	6364	2392	2323	2323
query3	11104	4982	5086	4982
query4	33307	23930	23908	23908
query5	5041	621	466	466
query6	287	198	205	198
query7	3990	488	302	302
query8	325	236	225	225
query9	9265	2753	2733	2733
query10	481	308	239	239
query11	18223	15404	15194	15194
query12	166	115	107	107
query13	1597	527	420	420
query14	10479	7230	7361	7230
query15	270	210	189	189
query16	7988	605	469	469
query17	1526	793	572	572
query18	2174	401	317	317
query19	197	179	171	171
query20	124	116	116	116
query21	208	130	104	104
query22	4575	4652	4493	4493
query23	34669	33808	33605	33605
query24	6470	2279	2352	2279
query25	487	465	398	398
query26	771	270	159	159
query27	2238	463	332	332
query28	5676	2514	2513	2513
query29	649	567	425	425
query30	203	181	148	148
query31	1013	918	863	863
query32	89	59	58	58
query33	471	362	299	299
query34	775	851	533	533
query35	825	829	815	815
query36	1013	1042	964	964
query37	113	96	73	73
query38	4301	4449	4286	4286
query39	1517	1496	1447	1447
query40	206	118	103	103
query41	44	42	44	42
query42	123	103	113	103
query43	525	519	495	495
query44	1377	840	822	822
query45	186	175	174	174
query46	906	1059	664	664
query47	1999	1978	1987	1978
query48	390	401	335	335
query49	716	512	403	403
query50	645	687	405	405
query51	7265	7245	7140	7140
query52	105	108	111	108
query53	238	260	198	198
query54	521	501	456	456
query55	86	78	80	78
query56	267	265	244	244
query57	1215	1264	1165	1165
query58	226	218	233	218
query59	3234	3255	3144	3144
query60	273	288	261	261
query61	109	104	104	104
query62	879	826	756	756
query63	227	185	201	185
query64	3330	1061	672	672
query65	3347	3288	3298	3288
query66	796	427	311	311
query67	16703	15875	15667	15667
query68	10079	766	549	549
query69	531	299	256	256
query70	1208	1136	1130	1130
query71	432	295	276	276
query72	6239	4106	3846	3846
query73	676	744	360	360
query74	10363	9242	8828	8828
query75	4554	3168	2675	2675
query76	5189	1203	783	783
query77	937	354	278	278
query78	10120	10360	9381	9381
query79	2728	910	621	621
query80	707	525	430	430
query81	490	278	235	235
query82	632	155	123	123
query83	191	168	144	144
query84	283	93	75	75
query85	782	348	354	348
query86	349	315	298	298
query87	4393	4454	4378	4378
query88	3332	2248	2218	2218
query89	427	331	301	301
query90	1877	187	192	187
query91	133	133	102	102
query92	63	52	50	50
query93	1130	898	545	545
query94	654	387	296	296
query95	335	265	259	259
query96	478	605	283	283
query97	2754	2817	2702	2702
query98	223	207	205	205
query99	1674	1571	1446	1446
Total cold run time: 300185 ms
Total hot run time: 198017 ms

@doris-robot

ClickBench: Total hot run time: 32.14 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit de5a28cbe75f6a0fbae826970c44d9dfb7f8e500, data reload: false

query1	0.04	0.04	0.05
query2	0.08	0.03	0.03
query3	0.24	0.06	0.07
query4	1.63	0.10	0.10
query5	0.41	0.42	0.39
query6	1.14	0.66	0.64
query7	0.02	0.01	0.01
query8	0.04	0.02	0.04
query9	0.59	0.50	0.50
query10	0.57	0.57	0.56
query11	0.15	0.11	0.11
query12	0.14	0.11	0.12
query13	0.60	0.60	0.60
query14	2.72	2.88	2.86
query15	0.89	0.83	0.83
query16	0.39	0.37	0.37
query17	1.01	0.99	1.06
query18	0.19	0.19	0.20
query19	1.91	1.82	2.01
query20	0.01	0.01	0.01
query21	15.38	0.91	0.59
query22	0.74	0.87	0.61
query23	15.28	1.34	0.55
query24	3.42	1.79	2.17
query25	0.14	0.14	0.09
query26	0.22	0.14	0.14
query27	0.05	0.06	0.05
query28	14.61	1.41	1.04
query29	12.57	3.93	3.26
query30	0.24	0.08	0.06
query31	2.83	0.58	0.38
query32	3.22	0.53	0.47
query33	3.11	3.05	3.16
query34	17.01	5.13	4.55
query35	4.54	4.46	4.56
query36	0.64	0.49	0.50
query37	0.09	0.06	0.06
query38	0.04	0.04	0.04
query39	0.04	0.03	0.02
query40	0.17	0.13	0.13
query41	0.07	0.02	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 107.26 s
Total hot run time: 32.14 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.89% (10121/26025)
Line Coverage: 29.88% (85498/286173)
Region Coverage: 29.01% (43708/150680)
Branch Coverage: 25.54% (22300/87304)
Coverage Report: http://coverage.selectdb-in.cc/coverage/de5a28cbe75f6a0fbae826970c44d9dfb7f8e500_de5a28cbe75f6a0fbae826970c44d9dfb7f8e500/report/index.html

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32654 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit bbe8acfc8837e947c128bd714262fdea3ed2aebb, data reload: false

------ Round 1 ----------------------------------
q1	17916	6329	6114	6114
q2	2050	323	165	165
q3	10863	1263	750	750
q4	10430	865	436	436
q5	9092	2214	1975	1975
q6	208	182	149	149
q7	889	752	589	589
q8	9246	1356	1202	1202
q9	5334	4961	4959	4959
q10	6765	2329	1878	1878
q11	486	278	257	257
q12	352	369	226	226
q13	17775	3535	2979	2979
q14	233	235	224	224
q15	564	510	496	496
q16	629	620	575	575
q17	567	865	329	329
q18	7173	6399	6305	6305
q19	1885	973	572	572
q20	307	320	186	186
q21	2865	2176	1980	1980
q22	365	335	308	308
Total cold run time: 105994 ms
Total hot run time: 32654 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6341	6259	6233	6233
q2	241	323	227	227
q3	2222	2650	2320	2320
q4	1425	1848	1406	1406
q5	4376	4746	4959	4746
q6	187	176	145	145
q7	2130	1969	1872	1872
q8	2622	2776	2714	2714
q9	7353	7337	7347	7337
q10	3102	3355	2748	2748
q11	571	524	507	507
q12	708	769	638	638
q13	3419	3783	3095	3095
q14	301	316	285	285
q15	578	528	512	512
q16	679	690	651	651
q17	1250	1727	1266	1266
q18	7654	7607	7320	7320
q19	816	1078	1132	1078
q20	2039	2045	1851	1851
q21	5646	5180	4986	4986
q22	613	606	594	594
Total cold run time: 54273 ms
Total hot run time: 52531 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.89% (10120/26025)
Line Coverage: 29.88% (85523/286176)
Region Coverage: 29.00% (43704/150684)
Branch Coverage: 25.54% (22297/87306)
Coverage Report: http://coverage.selectdb-in.cc/coverage/bbe8acfc8837e947c128bd714262fdea3ed2aebb_bbe8acfc8837e947c128bd714262fdea3ed2aebb/report/index.html

@doris-robot

TPC-DS: Total hot run time: 197384 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit bbe8acfc8837e947c128bd714262fdea3ed2aebb, data reload: false

query1	1319	932	919	919
query2	6386	2423	2398	2398
query3	11121	4803	5056	4803
query4	32931	23735	23368	23368
query5	3676	632	436	436
query6	292	197	187	187
query7	3985	494	301	301
query8	303	249	252	249
query9	9442	2713	2699	2699
query10	456	313	243	243
query11	17852	15361	15233	15233
query12	158	109	106	106
query13	1569	551	409	409
query14	10366	8050	8031	8031
query15	258	204	212	204
query16	8519	653	475	475
query17	1625	763	588	588
query18	1969	399	317	317
query19	220	193	155	155
query20	120	118	111	111
query21	204	130	107	107
query22	4850	4586	4642	4586
query23	34677	33986	33429	33429
query24	6614	2319	2298	2298
query25	464	458	411	411
query26	758	280	159	159
query27	2028	481	338	338
query28	5356	2489	2440	2440
query29	588	542	416	416
query30	216	180	148	148
query31	987	931	829	829
query32	77	63	57	57
query33	500	343	299	299
query34	776	860	502	502
query35	828	910	792	792
query36	1035	1091	993	993
query37	114	95	73	73
query38	4198	4157	4136	4136
query39	1550	1467	1499	1467
query40	205	114	109	109
query41	46	47	44	44
query42	115	106	110	106
query43	532	546	528	528
query44	1289	819	826	819
query45	184	181	180	180
query46	899	1085	659	659
query47	2017	2016	1918	1918
query48	388	405	347	347
query49	710	509	417	417
query50	652	664	388	388
query51	7442	7263	7277	7263
query52	97	101	96	96
query53	228	255	192	192
query54	480	505	415	415
query55	84	83	83	83
query56	259	254	254	254
query57	1277	1238	1201	1201
query58	246	227	218	218
query59	3196	3301	3178	3178
query60	276	264	252	252
query61	112	115	108	108
query62	846	830	757	757
query63	233	196	196	196
query64	2990	1024	659	659
query65	3332	3229	3247	3229
query66	802	423	303	303
query67	16621	15867	15556	15556
query68	9297	752	503	503
query69	490	299	248	248
query70	1265	1196	1125	1125
query71	443	289	253	253
query72	6361	3836	3870	3836
query73	666	749	358	358
query74	10444	9306	8899	8899
query75	4529	3172	2617	2617
query76	5319	1196	754	754
query77	949	359	293	293
query78	10113	10295	9373	9373
query79	4951	952	598	598
query80	731	515	426	426
query81	479	269	229	229
query82	327	150	125	125
query83	198	171	144	144
query84	280	91	73	73
query85	743	371	299	299
query86	349	332	275	275
query87	4591	4724	4298	4298
query88	3726	2239	2200	2200
query89	448	338	299	299
query90	2132	182	186	182
query91	132	132	102	102
query92	68	61	54	54
query93	3097	873	529	529
query94	659	394	322	322
query95	338	266	253	253
query96	488	595	278	278
query97	2674	2843	2625	2625
query98	236	208	195	195
query99	1636	1576	1432	1432
Total cold run time: 302104 ms
Total hot run time: 197384 ms

@doris-robot

ClickBench: Total hot run time: 31.57 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit bbe8acfc8837e947c128bd714262fdea3ed2aebb, data reload: false

query1	0.04	0.03	0.03
query2	0.09	0.04	0.04
query3	0.23	0.06	0.06
query4	1.63	0.10	0.10
query5	0.42	0.40	0.41
query6	1.15	0.68	0.66
query7	0.02	0.02	0.01
query8	0.05	0.05	0.05
query9	0.55	0.49	0.50
query10	0.56	0.57	0.55
query11	0.17	0.12	0.12
query12	0.16	0.13	0.12
query13	0.61	0.60	0.60
query14	2.71	2.73	2.72
query15	0.90	0.84	0.85
query16	0.37	0.38	0.39
query17	1.07	1.06	1.01
query18	0.18	0.18	0.19
query19	1.85	1.89	2.04
query20	0.01	0.01	0.02
query21	15.35	0.95	0.66
query22	0.76	0.80	0.69
query23	14.97	1.46	0.69
query24	2.22	0.37	0.21
query25	0.15	0.08	0.09
query26	0.29	0.19	0.18
query27	0.08	0.08	0.08
query28	13.41	1.73	1.13
query29	12.62	4.08	3.37
query30	0.25	0.08	0.06
query31	2.85	0.59	0.40
query32	3.22	0.56	0.47
query33	3.11	3.12	3.15
query34	16.42	5.16	4.54
query35	4.55	4.57	4.62
query36	0.61	0.49	0.46
query37	0.19	0.15	0.16
query38	0.16	0.15	0.16
query39	0.05	0.04	0.04
query40	0.17	0.14	0.13
query41	0.10	0.06	0.04
query42	0.06	0.05	0.05
query43	0.05	0.04	0.05
Total cold run time: 104.41 s
Total hot run time: 31.57 s

@github-actions github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on Jan 2, 2025
Contributor

github-actions bot commented Jan 2, 2025

PR approved by at least one committer and no changes requested.

Contributor

github-actions bot commented Jan 2, 2025

PR approved by anyone and no changes requested.

KassieZ added a commit to apache/doris-website that referenced this pull request Jan 2, 2025
docs about apache/doris#46137
## Versions 

- [x] dev
- [x] 3.0
- [x] 2.1
- [ ] 2.0

## Languages

- [x] Chinese
- [x] English

## Docs Checklist

- [ ] Checked by AI
- [ ] Test Cases Built

---------

Co-authored-by: KassieZ <[email protected]>
@morningman morningman merged commit e9f18c5 into apache:master Jan 3, 2025
23 of 25 checks passed
Labels
approved (indicates a PR has been approved by one committer), dev/2.1.x-experimental, dev/3.0.x-experimental, reviewed
5 participants